Mathematical Reasoning on AIME (Result and Process Tracking)
[Chart: Result Accuracy and Process Accuracy over time. Highlighted point: StaRPO, 26.67 Result Accuracy, Apr 10, 2026.]
Updated 6d ago
Evaluation Results

Method     Model       Date      Result Accuracy   Process Accuracy
StaRPO     Qwen 7B     2026.04   26.67             23.33
CPPO       Qwen 7B     2026.04   23.33             20
GRPO       Qwen 7B     2026.04   23.33             16.67
λ–GRPO     Qwen 7B     2026.04   16.67             13.33
CPPO       Qwen 1.5B   2026.04   13.33             10
Entropy    Qwen 7B     2026.04   13.33             13.33
GRPO       Qwen 1.5B   2026.04   10                10
StaRPO     Qwen 1.5B   2026.04   10                10
Original   Qwen 7B     2026.04   10                10
λ–GRPO     Qwen 1.5B   2026.04   6.67              6.67
Entropy    Qwen 1.5B   2026.04   6.67              6.67
Planner    Qwen 7B     2026.04   6.67              6.67
Original   Qwen 1.5B   2026.04   3.33              3.33
Planner    Qwen 1.5B   2026.04   0                 0