Mathematical Reasoning on Beyond-AIME (Avg@5)

77.6 Avg@5 Score

GPT-5-high
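
This page does not define Avg@5. A common reading is the mean accuracy over five independent sampled runs on the same problem set, reported as a percentage. The sketch below works under that assumption; the function name `avg_at_k` and the toy grading data are hypothetical and are not taken from the benchmark.

```python
from statistics import mean


def avg_at_k(correct_by_run: list[list[bool]]) -> float:
    """Mean accuracy in percent across k independent runs.

    correct_by_run[r][p] is True when run r solved problem p.
    Assumption: Avg@k is the plain average of per-run accuracies.
    """
    if not correct_by_run:
        raise ValueError("need at least one run")
    n_problems = len(correct_by_run[0])
    if n_problems == 0 or any(len(run) != n_problems for run in correct_by_run):
        raise ValueError("every run must grade the same non-empty problem set")
    # Accuracy of each run, then the average over runs.
    per_run_accuracy = [sum(run) / n_problems for run in correct_by_run]
    return 100.0 * mean(per_run_accuracy)


# Toy data (hypothetical): 5 runs over 4 problems.
runs = [
    [True, True, False, True],   # 3 of 4 correct
    [True, False, False, True],  # 2 of 4 correct
    [True, True, True, True],    # 4 of 4 correct
    [True, True, False, False],  # 2 of 4 correct
    [True, True, False, True],   # 3 of 4 correct
]
score = avg_at_k(runs)
print(round(score, 1))  # prints 70.0
```

Averaging over several runs reduces the sampling noise of a single attempt, which is presumably why the scores here carry one decimal place.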

[Chart: Avg@5 score over time. Y-axis ticks: 23.728, 37.714, 51.7, 65.686. X-axis date: Jan 23, 2026]
Updated 3d ago

Evaluation Results

Date      Avg@5
2026.01   77.6
2026.01   77.3
2026.01   76.6
2026.01   75.2
2026.01   72.2
2026.01   71.8
2026.01   71.6
2026.01   70.6
2026.01   69.4
2026.01   68.6
2026.01   68.6
2026.01   68.2
2026.01   68
2026.01   67.1
2026.01   65.4
2026.01   64.2
2026.01   63.6
2026.01   62.2
2026.01   61.8
2026.01   60.1
2026.01   58.8
2026.01   57.9
2026.01   57.7
2026.01   57.2
2026.01   56.8
2026.01   56.3
2026.01   55.6
2026.01   55
2026.01   54.8
2026.01   54.8
2026.01   54.4
2026.01   53.6
2026.01   52.8
2026.01   52.2
2026.01   52.2
2026.01   51.6
2026.01   51.6
2026.01   50.2
2026.01   48.6
2026.01   47.6
2026.01   46.6
2026.01   43.8
2026.01   37
2026.01   33
2026.01   32.6
2026.01   30.4
2026.01   28.8
2026.01   25.8

     A      B      B-A     C      C-A
-    57.2   57.7   0.5     52.2   -5
-    68.2   67.1   -1.1    63.6   -4.6
-    71.6   77.3   5.7     75.2   3.6
-    51.6   56.3   4.7     50.2   -1.4
-    54.4   57.9   3.5     51.6   -2.8
-    69.4   68.6   -0.8    64.2   -5.2
-    58.8   61.8   3       55     -3.8
-    54.8   43.8   -11     46.6   -8.2
-    56.8   52.8   -4      47.6   -9.2
-    76.6   72.2   -4.4    65.4   -11.2
-    68     77.6   9.6     70.6   2.6
-    37     30.4   -6.6    25.8   -11.2
-    53.6   60.1   6.5     52.2   -1.4
-    28.8   33     4.2     32.6   3.8
-    54.8   55.6   0.8     48.6   -6.2
-    71.8   68.6   -3.2    62.2   -9.6