
Mathematical Reasoning on Macro Average Selected Benchmarks

52.8 Pass@1 (Avg@32)

Mix-RL

Updated 5mo ago

Evaluation Results

Method	Links	Pass@1 (Avg@32)
Mix-RL 2026.02		52.8
Fold-RL 2026.02		52.7
Unfold-RL 2026.02		52.2
Fold-RL 2026.02		52.1
Mix-RL 2026.02		52.1
Unfold-RL 2026.02		52.0
Unfold-RL 2026.02		50.3
Unfold-RL 2026.02		49.2
Cold-Start 2026.02		48.5
Zero-RL 2026.02		47.6
Cold-Start 2026.02		47.5
Cold-Start 2026.02		45.7
Zero-RL 2026.02		44.6
Cold-Start 2026.02		42.6
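
The page does not define how the metric is computed. As a rough sketch only, the code below assumes that "Pass@1 (Avg@32)" means sampling 32 responses per problem and averaging their correctness, and that "Macro Average" means an unweighted mean of the per-benchmark scores. The function names and the input layout are hypothetical, chosen for illustration.

```python
from statistics import mean


def pass_at_1_avg_k(samples_correct):
    """Estimate Pass@1 (in percent) from k sampled responses per problem.

    samples_correct: one list per problem, each holding k booleans
    (True if that sampled response was graded correct).
    Assumption: Avg@k is the mean correctness over the k samples,
    then averaged over problems.
    """
    per_problem = [mean(1.0 if ok else 0.0 for ok in s) for s in samples_correct]
    return 100.0 * mean(per_problem)


def macro_average(benchmark_scores):
    """Unweighted mean over benchmarks (assumed meaning of 'Macro Average').

    benchmark_scores: dict mapping benchmark name -> score in percent.
    """
    return mean(benchmark_scores.values())


if __name__ == "__main__":
    # Toy data: 2 problems, 4 samples each (a real run would use k = 32).
    toy = [[True, False, True, True], [False, False, True, False]]
    print(pass_at_1_avg_k(toy))
    print(macro_average({"bench_a": 50.0, "bench_b": 60.0}))
```

Under these assumptions, every benchmark contributes equally to the final number regardless of how many problems it contains.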