Mathematical Reasoning on AIME 25 (Avg@6)
[Chart: Avg@6 by date. Top result: Qwen3-235B, 70.6 Avg@6 (Jan 30, 2026). Updated 4d ago.]
Evaluation Results
Method          Configuration              Date     Avg@6
Qwen3-235B      Reasoning protocol=non...  2026.01  70.6
SYNTHAGENT-14B  Reasoning protocol=non...  2026.01  66.7
ToolStar-14B    Reasoning protocol=non...  2026.01  63.3
SYNTHAGENT-8B   Reasoning protocol=non...  2026.01  58.9
ToolStar-8B     Reasoning protocol=non...  2026.01  54.4
Qwen3-32B       Reasoning protocol=non...  2026.01  41.1
Qwen3-14B       Reasoning protocol=non...  2026.01  35.6