
Mathematical Problem Solving on MATH

97.6 Accuracy

DeepSeek-R1

[Chart: accuracy over time, Jun 18, 2024 to Feb 11, 2026]
Updated 4d ago

Evaluation Results

Date      Accuracy
2026.01   97.6
2026.01   96.5
2026.01   95.2
2026.01   95
2026.01   94.5
2026.01   93.2
2026.01   92.6
2026.01   90.7
2026.01   88.2
2025.05   84
2025.12   83.28
2026.01   83.2
2025.02   83.1
2025.02   83
2026.01   79.8
2026.02   79.8
2026.02   79.7
2026.02   79.3
2025.05   78.86
2026.02   78.6
2026.02   78.2
2025.05   77.63
2026.02   77.2
2026.02   75.9
2026.02   75.6
2026.02   74.7
2026.02   74.6
2026.02   73.9
2025.02   73.8
2026.02   72.8
2026.02   71.3
2026.02   69.6
2025.02   69
2026.02   68.8
2025.02   68
2026.02   68
2026.02   66.8
2025.10   64.2
2026.02   63.7
2026.02   63.4
2026.02   62.6
2026.02   62.3
2026.02   61.9
2026.02   61.4
2026.02   61.4
2026.02   61
2026.02   60.8
2026.02   60.8
2026.02   58
2025.04   56.3
2025.04   56.26
2025.04   56
2025.04   55.8
2025.04   54.96
2025.04   54.46
2025.12   54.22
2024.06   53.6
2024.06   53.1
2024.06   52.9
2026.01   52.7
2026.01   52.6
2026.02   52.57
2026.01   52.4
2026.01   51.9
2026.01   51.7
2026.01   51.2
2025.10   51.2
2026.02   51
2026.01   51
2026.02   50.7
2026.01   50.6
2025.10   50.4
2026.01   50.2
2026.01   50
2026.01   49.2
2025.04   48.6
2026.01   48.2
2025.04   47.84
2025.12   46.6
2025.10   45.8
2025.10   45.4
2026.02   44.2
2025.10   43.4
2025.04   42.82
2026.01   41
2026.01   40.7
2026.02   40
2026.01   40
2026.01   39.6
2026.01   39.2
2025.12   38.78
2025.12   38.58
2026.02   37.9
2026.02   37.7
2026.02   37.6
2025.12   36.76
2026.02   36.6
2026.01   36.6
2026.02   35.4
2026.02   34.8
Showing 100 of 166 rows