Mathematical Reasoning on AIME 25 (avg@8)
Best Avg@8 Score: 53.8, NPR (Variant), Dec 8, 2025
Evaluation Results
Method                   Configuration              Date     Avg@8 Score
NPR (Variant)            Data=orz-8k, Train=Par...  2025.12  53.8
NPR                      Data=orz-8k, Train=Par...  2025.12  50.4
SR                       Data=orz-8k, Train=Seq...  2025.12  49.2
Qwen3-4B-Instruct-2507   Data=N/A, Train=N/A, B...  2025.12  47.4
Multiverse-32B           Data=s1.1-8k, Train=S→...  2025.12  45.8
NPR-BETA (Variant)       Data=orz-8k, Train=Par...  2025.12  43.8
Multiverse-4B            Data=s1.1-8k, Train=S→...  2025.12  42.9
NPR-BETA                 Data=orz-8k, Train=Par...  2025.12  42.9
SR-BETA                  Data=orz-8k, Train=Seq...  2025.12  37.1
Qwen3-4B (Non-Thinking)  Data=N/A, Train=N/A, B...  2025.12  19.1
Qwen2.5-32B-Instruct     Data=N/A, Train=N/A, B...  2025.12  10.4