
MATH

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Mathematical Reasoning | MATH | Accuracy: 95.63 | 882 |
| Mathematical Reasoning | MATH | Accuracy: 94.2 | 535 |
| Mathematical Reasoning | MATH500 (test) | Accuracy: 97.3 | 514 |
| Mathematical Reasoning | MATH 500 | Accuracy: 98.4 | 442 |
| Mathematical Reasoning | MATH (test) | Overall Accuracy: 94.13 | 433 |
| Mathematical Reasoning | MATH | Accuracy: 96.67 | 338 |
| Mathematical Reasoning | MATH 500 | pass@1: 97.3 | 239 |
| Mathematical Problem Solving | MATH | Accuracy: 97.6 | 229 |
| Mathematical Reasoning | MATH (test) | Pass@1: 94.8 | 151 |
| Mathematical Reasoning | MATH 500 | Accuracy (Acc): 97.8 | 149 |
| Mathematical Reasoning | Math500 | Accuracy (ACC): 88.2 | 133 |
| Math Reasoning | MATH | Accuracy: 75.5 | 121 |
| Mathematical Reasoning | MATH-500 | Accuracy: 97.6 | 119 |
| Mathematical Reasoning | MATH 500 | Top-1 Accuracy: 90 | 112 |
| Mathematical Reasoning | MATH | Pass@1: 92.71 | 112 |
| Mathematical Reasoning | MATH500 (full) | Accuracy: 92.4 | 111 |
| Mathematical Reasoning | MATH 500 | MATH 500 Accuracy: 97.9 | 106 |
| Mathematics | MATH 500 | Pass@1: 97.3 | 95 |
| Mathematical Reasoning | MATH L5 | Accuracy: 0.859 | 90 |
| Reasoning | MATH 500 | Accuracy (%): 100 | 90 |
| Math | MATH-500 | Accuracy: 99.2 | 86 |
| Mathematics | MATH | MATH Accuracy: 95.1 | 85 |
| Mathematical Reasoning | MATH500 | Accuracy: 96.2 | 82 |
| Mathematical Reasoning | MATH 500 | Pass@1 Rate: 88.67 | 76 |
| Mathematical Reasoning | MATH 500 | Accuracy: 91.2 | 73 |
Showing 25 of 541 rows
...