
Mathematical Reasoning on MATH (Accuracy, Response Tokens, Length Reduction)

Top accuracy: 46.74 (GRPO+FIRSTN)

[Chart: accuracy over time; y-axis ticks 44.6808, 45.2154, 45.75, 46.2846; x-axis: May 27, 2026]
Updated 6d ago

Evaluation Results

Date      Accuracy   Response Tokens   Length Reduction (%)
2026.05   46.74      371.49            8
2026.05   46.07      403.94            -
2026.05   45.28      359.8             10.9
2026.05   45.02      163.15            59.6
2026.05   44.76      154.1             61.9
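
The Length Reduction figures above appear to be the percentage drop in response tokens relative to the entry listed with "-" (403.94 tokens). That reading is an assumption inferred from the numbers, not something the page states. A minimal sketch of the arithmetic, with the hypothetical names BASELINE_TOKENS and length_reduction:

```python
# Assumption: the entry with "-" for length reduction (403.94 response
# tokens) is the reference that the other entries are compared against.
BASELINE_TOKENS = 403.94


def length_reduction(tokens: float, baseline: float = BASELINE_TOKENS) -> float:
    """Percent reduction in response tokens relative to the baseline."""
    return round(100.0 * (baseline - tokens) / baseline, 1)


# (accuracy, response tokens) pairs taken from the results above.
rows = [(46.74, 371.49), (45.28, 359.8), (45.02, 163.15), (44.76, 154.1)]

for accuracy, tokens in rows:
    print(f"accuracy={accuracy}  tokens={tokens}  reduction={length_reduction(tokens)}%")
```

Under this assumption the computed values (8.0, 10.9, 59.6, 61.9) match the listed reductions, which supports reading the column as a percentage.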