Mathematics on AIME25 (Accuracy)

Accuracy: 93.33

gpt-oss-puzzle-88B

[Chart: accuracy over time, Jan 27, 2026 to Apr 10, 2026]
Updated 5d ago

Evaluation Results

Date      Accuracy
2026.02   93.33
2026.02   92.92
2026.04   91.8
2026.02   91.46
2026.04   91.4
2026.02   89.58
2026.01   89.1
2026.01   87.9
2026.02   86.88
2026.02   86.67
2026.01   85
2026.01   83.3
2026.04   83
2026.04   78.6
2026.02   77.92
2026.02   76.88
2026.04   66.3
2026.02   66.25
2026.04   63.33
2026.04   63.33
2026.02   62.89
2026.03   60
2026.04   60
2026.04   60
2026.03   56.67
2026.04   53.33
2026.04   53.33
2026.02   51.88
2026.02   50
2026.04   50
2026.04   46.7
2026.04   46.7
2026.04   46.7
2026.04   46.67
2026.04   46.67
2026.04   40
2026.04   40
2026.04   40
2026.04   40
2026.04   40
2026.04   40
2026.04   36.67
2026.04   36.67
2026.04   36.67
2026.04   36.67
2026.04   36.67
2026.04   36.67
2026.04   33.33
2026.04   33.33
2026.04   33.33
2026.04   33.33
2026.04   30
2026.04   30
2026.04   26.67
2026.04   26.67
2026.03   23.33
2026.04   23.33
2026.04   23.33
2026.03   16.67
2026.04   16.67
2026.04   6.67
2026.04   0.42
2026.04   0