Mathematical Reasoning on AIME 2024 (pass@16)
[Chart: Pass@16 by date. Highlighted result: Ministral 3, 89.8. Date label: Jan 13, 2026.]
Evaluation Results
Method        Model Size   Date      Pass@16
Ministral 3   14B          2026.01   89.8
Qwen3-VL      8B           2026.01   86
Ministral 3   8B           2026.01   86
Qwen 3        14B          2026.01   83.7
Ministral 3   3B           2026.01   77.5
Qwen3-VL      4B           2026.01   72.9