Mathematical Reasoning on MATH 500 (Avg@3)
Top result: ADORA, 76.2 Avg@3 (Feb 10, 2026)
Evaluation Results
| Method | Configuration | Date | Avg@3 |
|---|---|---|---|
| ADORA | Backbone=Qwen2.5-7B, T... | 2026.02 | 76.2 |
| GRPO | Backbone=Qwen2.5-7B, S... | 2026.02 | 73.2 |
| Qwen2.5-7B | Training Method=Base,... | 2026.02 | 57.2 |
| ADORA | Backbone=DeepSeek-Math... | 2026.02 | 41.8 |
| GRPO | Backbone=DeepSeek-Math... | 2026.02 | 39.5 |
| ADORA | Backbone=Llama-3.1-8B,... | 2026.02 | 39.4 |
| GRPO | Backbone=Llama-3.1-8B,... | 2026.02 | 33.8 |
| ADORA | Backbone=Mistral-v0.1-... | 2026.02 | 30.4 |
| GRPO | Backbone=Mistral-v0.1-... | 2026.02 | 26.8 |
| DeepSeek-Math-7B | Training Method=Base,... | 2026.02 | 19.6 |
| Llama-3.1-8B | Training Method=Base,... | 2026.02 | 12.7 |
| Mistral-v0.1-7B | Training Method=Base,... | 2026.02 | 5.4 |