
DeepMath

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Mathematical Reasoning | DeepMath | Pass@1 70.5 | 44 |
| Mathematical Reasoning | DeepMath 2025 (test) | Pass@1 55.8 | 32 |
| Mathematical Reasoning | DeepMath | Accuracy 46.36 | 30 |
| Mathematical Reasoning | DEEPMATH 128 samples | Top-1 Accuracy 35.93 | 12 |
| Mathematical Reasoning | DeepMath500 | Pass@1 Rate 69 | 12 |
| Mathematical Reasoning | DeepMath (test) | Pass@1 62 | 12 |
| Theorem Proving | DeepMath | FR (Fetch Rate) 94 | 8 |
| Mathematical Reasoning | DeepMath 103K subset | Accuracy 65.9 | 6 |
| Mathematical and General Reasoning | DeepMATH (test) | MATH 500 Score 83.4 | 4 |