
Mathematical Reasoning on AIME 25 (accuracy)

94.5 Accuracy

APRM

[Chart: accuracy over time. Y-axis ticks: 52.848, 63.6615, 74.475, 85.2885. X-axis: Nov 28, 2025 to Feb 12, 2026.]
Updated 3d ago

Evaluation Results

Date (YYYY.MM)  Accuracy (%)
2025.11         94.5
2025.11         93.6
2026.01         93.3
2026.01         93.3
2026.01         93.3
2026.01         93.3
2025.11         92.8
2025.11         91.9
2025.11         91.6
2026.01         91.3
2025.11         91.3
2025.11         91
2025.11         90.8
2026.02         90.7
2025.11         90.7
2025.11         90.6
2025.11         90.4
2025.11         90.4
2025.11         89.5
2025.11         89.2
2025.11         88
2025.11         88
2025.11         87.5
2026.02         87.4
2026.02         87.4
2026.02         87.4
2026.02         87.3
2026.02         87.3
2026.02         86.7
2026.02         86.7
2026.02         86.7
2026.02         86.7
2026.02         86
2025.11         86
2025.11         85.5
2025.11         85
2026.02         83.3
2026.02         83.3
2026.02         83.3
2026.02         83.3
2026.02         82.7
2026.02         82.6
2026.02         82
2026.02         82
2025.11         82
2026.02         81.3
2026.02         81.04
2026.02         80.7
2025.11         80
2025.11         80
2026.02         78.33
2025.11         78
2026.02         77.4
2026.02         75
2025.11         75
2026.02         73.33
2026.02         73.3
2026.02         72.22
2026.02         72.08
2026.02         71.11
2026.02         70.7
2026.02         70
2026.02         70
2025.11         69.9
2025.11         69.9
2026.02         69.4
2026.02         68.06
2026.02         67.78
2026.02         67.77
2026.02         66.67
2025.11         65.7
2026.02         65.55
2026.02         63.89
2026.02         63.88
2026.02         62.97
2025.11         62.6
2025.11         62
2026.02         61.87
2025.11         61.4
2026.02         61.1
2026.02         61.1
2026.02         61.09
2026.02         60.39
2026.02         59.73
2026.02         59.72
2026.02         59.7
2026.02         58.33
2026.02         58.33
2026.02         58.33
2026.02         58.32
2025.11         58
2026.02         57.78
2026.02         57.12
2025.11         57
2026.02         56.94
2026.02         56.67
2025.11         55.4
2025.11         55
2026.02         54.59
2026.02         54.45
Showing 100 of 201 rows