Mathematical Reasoning on Minerva (Avg@2)
[Chart: Avg@2 over time. Top entry: DARL, 57.7; date shown on the x-axis: Jan 21, 2026.]
Evaluation Results
| Method | Base | Verifier | Date | Avg@2 |
|---|---|---|---|---|
| DARL | Base | None | 2026.01 | 57.7 |
| RLPR | Base | None | 2026.01 | 56.5 |
| RLVR | Base | Rule | 2026.01 | 54.9 |
| TTRL | Base | Rule | 2026.01 | 52.8 |
| Oat-Zero | Math | Rule | 2026.01 | 52.1 |
| General Reasoner | Base | Model | 2026.01 | 51.7 |
| SimpleRL-Zoo | Math | Rule | 2026.01 | 51 |
| Qwen2.5-7B-Inst | - | None | 2026.01 | 49.4 |
| SimpleRL-Zoo | Base | Rule | 2026.01 | 49.2 |
| VeriFree | Base | None | 2026.01 | 49 |
| PRIME | Math | Rule | 2026.01 | 45.5 |
| RLPR | Inst | None | 2026.01 | 39 |
| DARL | Inst | None | 2026.01 | 37.9 |
| Qwen2.5-7B | - | None | 2026.01 | 37.6 |
| RLVR | Inst | Rule | 2026.01 | 35.2 |
| Llama3.1-8B-Inst | - | None | 2026.01 | 32.7 |
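This page does not define Avg@2. The sketch below assumes the common reading of Avg@k: sample k responses per problem, score each as correct or incorrect, and report the mean accuracy as a percentage. The function name `avg_at_k` and the input layout are hypothetical, chosen only for illustration; the benchmark's actual scoring pipeline may differ.

```python
from statistics import mean


def avg_at_k(correct: list[list[bool]], k: int) -> float:
    """Mean accuracy (in percent) over the first k sampled responses per problem.

    Assumption: correct[i][j] is True when the j-th sampled response to
    problem i was judged correct. This layout is illustrative, not taken
    from the leaderboard.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    if not correct:
        raise ValueError("at least one problem is required")

    per_problem = []
    for samples in correct:
        if len(samples) < k:
            raise ValueError("every problem needs at least k samples")
        # Fraction of the first k samples that were correct for this problem.
        per_problem.append(sum(samples[:k]) / k)

    # Average the per-problem fractions and convert to a percentage.
    return 100.0 * mean(per_problem)


# Three problems, two samples each: fractions are 1.0, 0.5 and 0.0.
example = [[True, True], [True, False], [False, False]]
print(avg_at_k(example, k=2))  # 50.0
```

Under this reading, a score such as 57.7 would mean that, averaged over all problems, 57.7% of the two sampled responses were judged correct.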