Mathematical Reasoning on AIME 2025 (test)

88.9 Pass@1 Rate

OpenAI-o3

[Chart: time series, x-axis from Jan 21, 2026 to Feb 11, 2026]
Updated 3d ago

Evaluation Results

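| Date | Pass@1 Rate | Other (unlabeled) |
|---|---|---|
| 2026.01 | 88.9 | - |
| 2026.01 | 87.5 | - |
| 2026.01 | 85.6 | - |
| 2026.01 | 84.2 | - |
| 2026.01 | 84 | - |
| 2026.01 | 83 | - |
| 2026.01 | 81.5 | - |
| 2026.01 | 74.4 | - |
| 2026.01 | 73.5 | - |
| 2026.01 | 73.3 | - |
| 2026.02 | 70 | - |
| 2026.01 | 70 | - |
| 2026.01 | 69.5 | - |
| 2026.02 | 68.33 | - |
| 2026.01 | 67.4 | - |
| 2026.02 | 62.5 | - |
| 2026.01 | 60 | - |
| 2026.02 | 56.66 | - |
| 2026.02 | 53.3 | - |
| 2026.02 | 52.5 | - |
| 2026.02 | 50 | - |
| 2026.01 | 49.6 | - |
| 2026.02 | 46.67 | - |
| 2026.02 | 40.83 | - |
| 2026.02 | 40.66 | - |
| 2026.02 | 38 | - |
| 2026.01 | 36.6 | 643 |
| 2026.01 | 35.3 | 354.3 |
| 2026.01 | 33.3 | 610.7 |
| 2026.01 | 33.3 | 633.1 |
| 2026.01 | 32.6 | 634.7 |
| 2026.02 | 31.67 | - |
| 2026.01 | 31.3 | 643 |
| 2026.01 | 31.3 | 647.6 |
| 2026.02 | 30 | - |
| 2026.01 | 29.3 | 443.5 |
| 2026.01 | 29.3 | 648.5 |
| 2026.01 | 28.6 | 213.9 |
| 2026.02 | 28.33 | - |
| 2026.01 | 26 | 434.7 |
| 2026.01 | 24.6 | 436.3 |
| 2026.01 | 22.6 | 423.4 |
| 2026.01 | 20 | 445.1 |
| 2026.01 | 16.6 | 445.7 |
| 2026.01 | 16.2 | - |
| 2026.01 | 14.6 | 441.3 |
| 2026.01 | 14 | - |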