MATH

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Mathematical Reasoning | MATH | Accuracy: 95.63 | 643 |
| Mathematical Reasoning | MATH | Accuracy: 94.2 | 535 |
| Mathematical Reasoning | MATH (test) | Overall Accuracy: 94.13 | 433 |
| Mathematical Reasoning | MATH500 (test) | Accuracy: 97.3 | 381 |
| Mathematical Problem Solving | MATH | Accuracy: 97.6 | 166 |
| Mathematical Reasoning | MATH | Accuracy: 96.67 | 162 |
| Mathematical Reasoning | MATH 500 | Accuracy: 92 | 155 |
| Mathematical Reasoning | MATH 500 | pass@1: 96.9 | 153 |
| Mathematical Reasoning | MATH (test) | Pass@1: 94.8 | 151 |
| Mathematical Reasoning | Math500 | Accuracy (ACC): 88.2 | 133 |
| Mathematical Reasoning | MATH-500 | Accuracy: 97.6 | 119 |
| Mathematical Reasoning | MATH | Pass@1: 92.71 | 112 |
| Mathematical Reasoning | MATH500 (full) | Accuracy: 92.4 | 111 |
| Mathematical Reasoning | MATH 500 | MATH 500 Accuracy: 97.9 | 106 |
| Math Reasoning | MATH | Accuracy: 75.5 | 88 |
| Mathematical Reasoning | MATH L5 | Accuracy: 0.859 | 86 |
| Mathematical Reasoning | MATH 500 | Accuracy: 91.2 | 73 |
| Hallucination Detection | Math | Mean AUROC: 81.57 | 72 |
| Reasoning | MATH 500 | Accuracy (%): 100 | 59 |
| Mathematical Reasoning | MATH500 | Accuracy: 92.1 | 57 |
| Mathematical Reasoning | MATH | Score: 96.7 | 50 |
| Mathematical Reasoning | MATH-500 | Token Savings: 84.1 | 48 |
| Mathematical Reasoning | MATH | Accuracy: 38.36 | 48 |
| Mathematical Reasoning | MATH500 | Accuracy: 90 | 45 |
| Math Reasoning | MATH500 | Accuracy: 93.2 | 41 |
Showing 25 of 325 rows
...