Mathematical Reasoning on AIME 24 (avg@8)
Best Avg@8: 63.3 (NPR), Dec 8, 2025
Evaluation Results
Method                    Details                      Links (date)   Avg@8
NPR                       Data=orz-8k, Train=Par...    2025.12        63.3
NPR (Variant)             Data=orz-8k, Train=Par...    2025.12        62.5
Qwen3-4B-Instruct-2507    Data=N/A, Train=N/A, B...    2025.12        60
SR                        Data=orz-8k, Train=Seq...    2025.12        57.1
Multiverse-32B            Data=s1.1-8k, Train=S→...    2025.12        53.8
NPR-BETA (Variant)        Data=orz-8k, Train=Par...    2025.12        52.5
SR-BETA                   Data=orz-8k, Train=Seq...    2025.12        52.1
NPR-BETA                  Data=orz-8k, Train=Par...    2025.12        50.8
Multiverse-4B             Data=s1.1-8k, Train=S→...    2025.12        46.7
Qwen3-4B (Non-Thinking)   Data=N/A, Train=N/A, B...    2025.12        25
Qwen2.5-32B-Instruct      Data=N/A, Train=N/A, B...    2025.12        15.8
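
The page does not define Avg@8. A common reading, assumed here, is the mean accuracy over 8 sampled answers per problem, averaged across all problems and reported as a percentage. The sketch below illustrates that reading only; the function name `avg_at_k` and the sample data are hypothetical and do not come from the leaderboard.

```python
from typing import Sequence


def avg_at_k(correct: Sequence[Sequence[bool]], k: int = 8) -> float:
    """Mean accuracy in percent over k sampled answers per problem.

    Assumption: `correct[i][j]` is True when sample j for problem i
    was graded as correct. This is one plausible definition of Avg@k,
    not necessarily the one used by the leaderboard.
    """
    if not correct:
        raise ValueError("no problems given")
    per_problem = []
    for samples in correct:
        if len(samples) != k:
            raise ValueError(f"expected {k} samples per problem, got {len(samples)}")
        # Fraction of the k samples that were correct for this problem.
        per_problem.append(sum(samples) / k)
    # Average the per-problem fractions and scale to a percentage.
    return 100.0 * sum(per_problem) / len(per_problem)


# Hypothetical grading results: 2 problems, 8 samples each.
results = [
    [True] * 8,                  # all 8 samples correct
    [True] * 4 + [False] * 4,    # half of the samples correct
]
print(round(avg_at_k(results), 1))  # 75.0
```

Under this reading, averaging over several samples makes the score less sensitive to sampling noise than a single-attempt accuracy would be.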