
Mathematical Reasoning on AIME 2024 (Pass@16, Mean@16, Token Metrics)

24.4 Mean Score @16

TTRL-PPO

[Chart: Mean Score @16 by date; x-axis label Dec 2, 2025]
Updated 4d ago

Evaluation Results

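[Method and Links columns: entries not recoverable. Only the Mean@16 column label is recoverable; the other metric columns are unlabeled.]

Date      Mean@16  Col 2  Col 3  Col 4  Col 5
2025.12   24.4     46.7   23.3   -      -
2025.12   24       43.3   26.7   -      42.85
2025.12   23.5     40     23.3   -      -
2025.12   23.5     43.3   23.3   -      38.56
2025.12   20.8     40     23.3   -      -
2025.12   20.6     40     23.3   -      41.45
2025.12   18.5     46.7   23.3   -      -
2025.12   16       46.7   23.3   -      15.2
2025.12   15.6     43.3   20     -      -
2025.12   14.6     46.7   20     -      17.25
2025.12   11.9     43.3   20     -      -
2025.12   10.4     46.7   20     -      14.35
2025.12   3.8      30     6.7    -      -
2025.12   3.8      30     6.7    -      13.7
2025.12   3.5      23.3   6.7    -      -
2025.12   1.7      20     6.7    -      47.62
2025.12   1.7      23.3   6.7    -      30.02
2025.12   1.5      16.7   6.7    -      -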
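
For readers unfamiliar with the metric names in the title, the sketch below shows one common way Mean@k and Pass@k are computed from k sampled answers per problem. The page does not state its exact definitions, so the function names `mean_at_k` and `pass_at_k`, the percent scaling, and the toy data are all assumptions made for illustration.

```python
from typing import List


def mean_at_k(correct: List[List[bool]]) -> float:
    """Average per-sample accuracy over all problems, in percent.

    correct[i][j] is True if sample j for problem i was right.
    """
    per_problem = [sum(row) / len(row) for row in correct]
    return 100.0 * sum(per_problem) / len(per_problem)


def pass_at_k(correct: List[List[bool]]) -> float:
    """Percent of problems with at least one correct sample."""
    solved = sum(1 for row in correct if any(row))
    return 100.0 * solved / len(correct)


# Toy data (hypothetical, not taken from the leaderboard):
# 3 problems, 4 samples each.
toy = [
    [True, False, False, True],
    [False, False, False, False],
    [True, True, True, True],
]

print(f"Mean@4: {mean_at_k(toy):.1f}")
print(f"Pass@4: {pass_at_k(toy):.1f}")
```

If the benchmark uses the 30 problems of AIME 2024 I and II, a per-problem metric such as Pass@16 can only take values that are multiples of 1/30 (about 3.3 points). Table values such as 46.7 and 43.3 are consistent with 14/30 and 13/30 under that assumption.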