Mathematical Reasoning on Mathematics (Pass@1)
[Chart: Pass@1 over time. Highlighted point: LEMMA (w/ MetaMath), Pass@1 65.8, Mar 21, 2025.]
Evaluation Results
Method               Configuration              Date     Pass@1
LEMMA (w/ MetaMath)  Backbone=DeepSeekMath-...  2025.03  65.8
LEMMA                Backbone=DeepSeekMath-...  2025.03  61.6
RefAug-90k           Backbone=DeepSeekMath-...  2025.03  57.5
RefAug               Backbone=DeepSeekMath-...  2025.03  56.9
GPTAug               Backbone=DeepSeekMath-...  2025.03  52.6
MetaMath             Backbone=DeepSeekMath-...  2025.03  49
RFT                  Backbone=DeepSeekMath-...  2025.03  46.2
LEMMA (w/ MetaMath)  Backbone=LLaMA3-8B, #...   2025.03  45.8
ISC                  Backbone=DeepSeekMath-...  2025.03  43.1
SFT                  Backbone=DeepSeekMath-...  2025.03  39.6
LEMMA                Backbone=LLaMA3-8B, #...   2025.03  39.2
GPTAug               Backbone=LLaMA3-8B, #...   2025.03  36.5
RefAug-90k           Backbone=LLaMA3-8B, #...   2025.03  35.7
RefAug               Backbone=LLaMA3-8B, #...   2025.03  35.5
MetaMath             Backbone=LLaMA3-8B, #...   2025.03  35.3
ISC                  Backbone=LLaMA3-8B, #...   2025.03  31.8
RFT                  Backbone=LLaMA3-8B, #...   2025.03  24.9
SFT                  Backbone=LLaMA3-8B, #...   2025.03  23.5