Mathematical Reasoning on AMC23 (Avg@32 Accuracy)
[Chart: Avg@32 Accuracy over time. Highlighted entry: Dynamic Sampling (Oracle), 90.5. Date shown: Feb 2, 2026.]
Evaluation Results
Method                     Configuration              Date     Avg@32 Accuracy
Dynamic Sampling (Oracle)  Backbone=DeepSeek-R1-D...  2026.02  90.5
GPS                        Backbone=DeepSeek-R1-D...  2026.02  90.5
PCL                        Backbone=DeepSeek-R1-D...  2026.02  90.1
GRESO                      Backbone=DeepSeek-R1-D...  2026.02  89.7
Uniform Sampling           Backbone=DeepSeek-R1-D...  2026.02  89.5
MoPPS                      Backbone=DeepSeek-R1-D...  2026.02  89.5
Dynamic Sampling (Oracle)  Backbone=DeepSeek-R1-D...  2026.02  79.2
GPS                        Backbone=DeepSeek-R1-D...  2026.02  78.1
MoPPS                      Backbone=DeepSeek-R1-D...  2026.02  77.8
PCL                        Backbone=DeepSeek-R1-D...  2026.02  77.7
GRESO                      Backbone=DeepSeek-R1-D...  2026.02  77.1
DeepSeek-R1-Distill-7B     Backbone=DeepSeek-R1-D...  2026.02  76.5
Uniform Sampling           Backbone=DeepSeek-R1-D...  2026.02  76.1
DeepSeek-R1-Distill-1.5B   Backbone=DeepSeek-R1-D...  2026.02  57.7