Mathematical Reasoning on Minerva-Math (avg@1)
Best result: 47.1 Avg@1 Accuracy, NPR (Variant)
[Chart: Avg@1 Accuracy over time, Dec 3, 2025 to Dec 8, 2025]
Updated 4d ago
Evaluation Results

| Method | Details | Date | Avg@1 Accuracy |
|---|---|---|---|
| NPR (Variant) | Data=orz-8k, Train=Par... | 2025.12 | 47.1 |
| NPR-BETA (Variant) | Data=orz-8k, Train=Par... | 2025.12 | 45.9 |
| NPR | Data=orz-8k, Train=Par... | 2025.12 | 43 |
| SR-BETA | Data=orz-8k, Train=Seq... | 2025.12 | 41.5 |
| Qwen3-4B-Instruct-2507 | Data=N/A, Train=N/A, B... | 2025.12 | 41.2 |
| NPR-BETA | Data=orz-8k, Train=Par... | 2025.12 | 41.2 |
| Qwen2.5-32B-Instruct | Data=N/A, Train=N/A, B... | 2025.12 | 40.8 |
| Multiverse-32B | Data=s1.1-8k, Train=S→... | 2025.12 | 40 |
| SR | Data=orz-8k, Train=Seq... | 2025.12 | 38.2 |
| Multiverse-4B | Data=s1.1-8k, Train=S→... | 2025.12 | 34.9 |
| Robust Bellman | Training Domain=Math D... | 2025.12 | 31.99 |
| DVPO | Training Domain=Math D... | 2025.12 | 31.62 |
| PPO | Training Domain=Math D... | 2025.12 | 30.51 |
| Base | Training Domain=Math D... | 2025.12 | 28.68 |
| GRPO | Training Domain=Math D... | 2025.12 | 28.68 |
| Qwen3-4B (Non-Thinking) | Data=N/A, Train=N/A, B... | 2025.12 | 28.5 |
| Dr.GRPO | Training Domain=Math D... | 2025.12 | 27.94 |
| Reinforce++ | Training Domain=Math D... | 2025.12 | 27.21 |