Mathematical Problem Solving on AIME'24 (test)
[Chart: Average@16 Accuracy. Top entry: PRM-CoT (Process-Aware), 26.67, Dec 2, 2025. Selectable metrics: Average@16 Accuracy, Pass@1 Accuracy, Pass@8 Accuracy, Pass@16 Accuracy.]
Updated 4d ago
Evaluation Results
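| Method                  | Configuration             | Date    | Average@16 Accuracy | Pass@1 Accuracy | Pass@8 Accuracy | Pass@16 Accuracy |
|-------------------------|---------------------------|---------|---------------------|-----------------|-----------------|------------------|
| PRM-CoT (Process-Aware) | Base Policy=Qwen2.5-Ma... | 2025.12 | 26.67               | 30              | 40              | 50               |
| RLVR (Ground Truth)     | Base Policy=Qwen2.5-Ma... | 2025.12 | 22.71               | 26.67           | 36.67           | 50               |
| PRM (Process-Aware)     | Base Policy=Qwen2.5-Ma... | 2025.12 | 18.12               | 26.67           | 40              | 46.67            |
| SFT (Baseline)          | Base Policy=Qwen2.5-Ma... | 2025.12 | 9.58                | 3.3             | 23.33           | 26.67            |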
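The page does not define the metric names used in the results above. As a rough illustration only, the sketch below shows one common reading: Average@n as the mean accuracy over n sampled answers per problem, and Pass@k as the share of problems where at least one of the first k samples is correct. The function names, the toy data, and these definitions are assumptions, not the leaderboard's stated method.

```python
from typing import Sequence


def average_at_n(correct: Sequence[Sequence[bool]]) -> float:
    """Mean per-sample accuracy in percent.

    correct[i][j] is True if sample j for problem i was graded correct.
    Assumed definition: average each problem's accuracy over its samples,
    then average over problems.
    """
    per_problem = [sum(row) / len(row) for row in correct]
    return 100.0 * sum(per_problem) / len(per_problem)


def pass_at_k(correct: Sequence[Sequence[bool]], k: int) -> float:
    """Percent of problems solved by at least one of the first k samples.

    Assumed definition: a simple empirical Pass@k, not the unbiased
    combinatorial estimator some papers use.
    """
    if any(len(row) < k for row in correct):
        raise ValueError("every problem needs at least k samples")
    solved = sum(1 for row in correct if any(row[:k]))
    return 100.0 * solved / len(correct)


# Hypothetical toy data: 4 problems, 4 samples each.
results = [
    [True, False, True, False],
    [False, False, False, True],
    [False, False, False, False],
    [True, True, True, True],
]

print(average_at_n(results))   # 43.75
print(pass_at_k(results, 1))   # 50.0
print(pass_at_k(results, 4))   # 75.0
```

Under this reading, and given that AIME 2024 comprises 30 problems, each solved problem contributes about 3.33 points to a Pass@k score, which would be consistent with values such as 26.67 (8 of 30) and 50 (15 of 30) in the results.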