
Mathematical Reasoning on AIME 2025 (Avg@1/8 e_m, Pass@1/8 e_m)

72.9 Avg@8 (e_m)

GPT-5 nano

[Chart: scores by date, May 8, 2026 to May 12, 2026]
Updated 20d ago

Evaluation Results

Method  Date     Avg@8  Avg@1  Pass@1  Pass@8  Links
        2026.05  72.9   73.3   73.3    86.7
                 66.7   76.7   76.7    93.3
        2026.05  61.2   63.3   63.3    76.7
        2026.05  58.3   60     60      70
                 31.7   30     30      50
                 26.7   26.7   26.7    26.7
        2026.05  25.8   -      -       36.66
        2026.05  25.41  -      -       36.28
        2026.05  17.92  -      -       32.57
                 17.5   13.3   13.3    20
                 13.3   13.3   13.3    13.3
                 9.6    6.7    6.7     10
        2026.05  7.81   -      -       16.66
                 0      0      0       0
                 0      0      0       0
        2026.05  0      0      0       0
                 0      0      0       0
                 0      0      0       0
                 0      0      0       0
                 0      0      0       0