
Math Reasoning on AIME25 (Accuracy, Tokens)

[Chart: Accuracy (static view), top value 30, dated Apr 6, 2026]
Updated 10d ago

Evaluation Results

Method  Links  Date     Accuracy  Tokens
               2026.04  30        12,701
               2026.04  30         9,429
               2026.04  30         8,567
               2026.04  30         9,222
               2026.04  26.7       7,921
               2026.04  26.7       6,813
               2026.04  23.3       6,304
               2026.04  23.3       6,837
               2026.04  23.3       4,116
               2026.04  23.3       5,018
               2026.04  22.5       6,784
               2026.04  22.5       7,084
               2026.04  22.5       6,795
               2026.04  22.5       6,140
               2026.04  20         4,202
               2026.04  20         6,286
               2026.04  20         4,089
               2026.04  17.5       4,616
               2026.04  16.7       3,063
               2026.04  16.7       3,203
               2026.04  15         3,119
               2026.04  12.5       2,483