Mathematical Reasoning on Macro Average Selected Benchmarks
[Chart: Pass@1 (Avg@32) by date. Highlighted entry: Mix-RL, 52.8, Feb 3, 2026.]
Evaluation Results
Method       Model Backbone   Date      Pass@1 (Avg@32)
Mix-RL       Qwen2.5...       2026.02   52.8
Fold-RL      Qwen2.5...       2026.02   52.7
Unfold-RL    Qwen2.5...       2026.02   52.2
Fold-RL      Qwen3-4...       2026.02   52.1
Mix-RL       Qwen3-4...       2026.02   52.1
Unfold-RL    Qwen3-4...       2026.02   52.0
Unfold-RL    Qwen2.5...       2026.02   50.3
Unfold-RL    Qwen3-4...       2026.02   49.2
Cold-Start   Qwen2.5...       2026.02   48.5
Zero-RL      Qwen3-4...       2026.02   47.6
Cold-Start   Qwen3-4...       2026.02   47.5
Cold-Start   Qwen2.5...       2026.02   45.7
Zero-RL      Qwen2.5...       2026.02   44.6
Cold-Start   Qwen3-4...       2026.02   42.6
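
The page does not define its metric, so the following is only a rough sketch under two assumptions: that "Pass@1 (Avg@32)" is the share of correct answers averaged over 32 sampled responses per problem, and that "Macro Average" is the unweighted mean of per-benchmark scores. The function names and the toy data (two made-up benchmarks, 4 samples per problem instead of 32) are hypothetical and not taken from the leaderboard.

```python
from statistics import mean

def pass_at_1_avg_k(samples_correct):
    """Assumed metric: average correctness over k samples per problem,
    then averaged over problems, reported as a percentage.

    samples_correct: list of problems, each a list of k booleans.
    """
    per_problem = [mean([1.0 if ok else 0.0 for ok in samples])
                   for samples in samples_correct]
    return 100.0 * mean(per_problem)

def macro_average(benchmark_scores):
    """Assumed aggregation: unweighted mean of per-benchmark scores."""
    return mean(list(benchmark_scores.values()))

# Hypothetical toy data, k=4 samples per problem.
toy = {
    "bench_a": [[True, True, False, False], [True, True, True, True]],
    "bench_b": [[False, False, False, True]],
}
scores = {name: pass_at_1_avg_k(s) for name, s in toy.items()}
overall = macro_average(scores)
print(scores, overall)
```

Under this reading, each benchmark counts equally in the final number regardless of how many problems it contains.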