Mathematical Reasoning on AIME (Result and Process Tracking)

26.67 Result Accuracy

StaRPO

Updated 3mo ago

Evaluation Results

Method	Links	Result Accuracy	Process Accuracy
StaRPO 2026.04		26.67	23.33
CPPO 2026.04		23.33	20
GRPO 2026.04		23.33	16.67
λ–GRPO 2026.04		16.67	13.33
CPPO 2026.04		13.33	10
Entropy 2026.04		13.33	13.33
GRPO 2026.04		10	10
StaRPO 2026.04		10	10
Original 2026.04		10	10
λ–GRPO 2026.04		6.67	6.67
Entropy 2026.04		6.67	6.67
Planner 2026.04		6.67	6.67
Original 2026.04		3.33	3.33
Planner 2026.04		0	0