Mathematical Reasoning on AIME (Pass@1 Accuracy, Length Exceeding Ratio)

63.7 Pass@1 Accuracy

Updated 4mo ago

Evaluation Results

Method	Date	Pass@1 Accuracy	Length Exceeding Ratio
-	2026.01	63.7	71.3
GDPO	2026.01	56.9	0.1
-	2026.01	55.4	85.6
GRPO	2026.01	54.6	2.5
GDPO	2026.01	53.1	0.2
GRPO	2026.01	50.2	2.1
-	2026.01	29.8	91.5
GDPO	2026.01	29.4	6.5
GRPO	2026.01	23.1	10.8