Mathematical Reasoning on AIME 2025 (Avg@10 and Pass@10)
[Chart: Avg@10 and Pass@10 over time. Labeled point: Qwen3-4B RL finetuned on HanabiRewards, Jan 26, 2026.]
Updated 4d ago
Evaluation Results
Method | Details | Date | Avg@10 | Pass@10
--- | --- | --- | --- | ---
Qwen3-4B RL finetuned on HanabiRewards | Backbone=Qwen3-4B, Var... | 2026.01 | 50 | 73.3
Qwen3-4B-Instruct-2507 | Backbone=Qwen3-4B, Var... | 2026.01 | 48.7 | 73.3
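The page does not define its two metrics, so the following is only a minimal sketch under a common reading: Avg@10 as the mean accuracy over 10 sampled answers per problem, and Pass@10 as the share of problems where at least one of the 10 samples is correct. The function names `avg_at_k` and `pass_at_k` and the toy data are hypothetical and are not taken from the leaderboard.

```python
from typing import Sequence


def avg_at_k(results: Sequence[Sequence[bool]]) -> float:
    """Mean per-sample accuracy in percent, averaged over problems.

    Assumed definition; results[i][j] is True if sample j for problem i is correct.
    """
    per_problem = [sum(samples) / len(samples) for samples in results]
    return 100.0 * sum(per_problem) / len(per_problem)


def pass_at_k(results: Sequence[Sequence[bool]]) -> float:
    """Percent of problems with at least one correct sample (assumed definition)."""
    solved = [any(samples) for samples in results]
    return 100.0 * sum(solved) / len(solved)


# Toy data, not real benchmark output: 3 hypothetical problems, 10 samples each.
toy = [
    [True] * 10,                # always solved
    [True] * 5 + [False] * 5,   # solved half the time
    [False] * 10,               # never solved
]

print(f"Avg@10  = {avg_at_k(toy):.1f}")
print(f"Pass@10 = {pass_at_k(toy):.1f}")
```

Under this reading, and assuming the 30-problem AIME 2025 set, a Pass@10 of 73.3 would correspond to 22 of 30 problems solved at least once, which would explain how two models can share the same Pass@10 while differing in Avg@10.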