
Mathematical Reasoning on AIME '25 (@1, @32)

32.3 Accuracy @1

[Chart: average Accuracy @1 over time; y-axis ticks 21.172, 24.061, 26.95, 29.839; x-axis tick Apr 1, 2026]
Updated 16d ago

Evaluation Results

Date      Accuracy @1   Accuracy @32
2026.04   32.3          66.7
2026.04   32            63.3
2026.04   31.5          53.3
2026.04   31.1          66.7
2026.04   30.7          66.7
2026.04   29.8          66.7
2026.04   27.8          60
2026.04   21.6          53.3