Mathematical Reasoning on AIME25 (Average Score)
[Chart: Average Score over time. Highlighted point: Tool-Star, 0.84, Jan 30, 2026.]
Evaluation Results
Method | Tag | Date | Average Score
Tool-Star | Category=SFT-RL TIR Me... | 2026.01 | 0.84
Tool-Star-SFT | Category=SFT-only TIR... | 2026.01 | 0.82
ToRL | Category=RL-only TIR M... | 2026.01 | 0.79
Vanilla SFT-RL TIR | Category=SFT-RL TIR Me... | 2026.01 | 0.78
AutoTraj | Category=SFT-RL TIR Me... | 2026.01 | 0.76
AutoTIR | Category=RL-only TIR M... | 2026.01 | 0.67
ReSearch | Category=RL-only TIR M... | 2026.01 | 0.37
Qwen2.5-7B-Instruct | Framework=Multi-Dimens... | 2026.01 | 0
R1-Searcher | Category=RL-only TIR M... | 2026.01 | 0