Mathematical Reasoning on AIME24 (Pass@1 Avg@32)
[Chart: Pass@1 Accuracy over time. Highlighted point: Mix-RL, 32.2 Pass@1 Accuracy, Feb 3, 2026.]
Evaluation Results
| Method | Model Backbone | Date | Pass@1 Accuracy |
|---|---|---|---|
| Mix-RL | Qwen2.5... | 2026.02 | 32.2 |
| Unfold-RL | Qwen2.5... | 2026.02 | 32 |
| Fold-RL | Qwen2.5... | 2026.02 | 31.3 |
| Unfold-RL | Qwen2.5... | 2026.02 | 29.1 |
| Fold-RL | Qwen3-4... | 2026.02 | 28.4 |
| Mix-RL | Qwen3-4... | 2026.02 | 27.6 |
| Unfold-RL | Qwen3-4... | 2026.02 | 27.5 |
| Cold-Start | Qwen2.5... | 2026.02 | 26.7 |
| Zero-RL | Qwen2.5... | 2026.02 | 25.8 |
| Unfold-RL | Qwen3-4... | 2026.02 | 25.8 |
| Zero-RL | Qwen3-4... | 2026.02 | 25.5 |
| Cold-Start | Qwen3-4... | 2026.02 | 23.8 |
| Cold-Start | Qwen2.5... | 2026.02 | 23 |
| Cold-Start | Qwen3-4... | 2026.02 | 19.2 |
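
The page does not define how "Pass@1 Avg@32" is computed. A common reading, assumed here, is that 32 answers are sampled per problem, each problem is scored by the fraction of its samples that are correct, and those fractions are averaged over all problems and reported as a percentage. The sketch below follows that assumption; the function name `pass_at_1_avg_k` and the input layout are hypothetical and not taken from the benchmark's evaluation code.

```python
from statistics import mean


def pass_at_1_avg_k(correct: list[list[bool]]) -> float:
    """Return Pass@1 as a percentage, averaged over k samples per problem.

    correct[i][j] is True when sample j for problem i was judged correct.
    Assumption: every problem has at least one sample (k = 32 on this page).
    """
    # Fraction of correct samples for each problem.
    per_problem = [mean(1.0 if c else 0.0 for c in samples) for samples in correct]
    # Average across problems, expressed on a 0-100 scale.
    return 100.0 * mean(per_problem)


# Toy input: 2 problems with 4 samples each (not real benchmark data).
toy = [
    [True, True, False, False],    # 2 of 4 samples correct
    [False, False, False, False],  # 0 of 4 samples correct
]
print(pass_at_1_avg_k(toy))
```

Averaging over many samples reduces the variance that a single greedy or sampled run would have on a small problem set, which is presumably why the scores listed above are reported this way.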