
Mathematical Reasoning on AIME 2024 (Pass@1, Pass@16, Token Metrics)

13.5 Mean@16

OptPO-SFT

3.4126.0318.6511.269
Dec 2, 2025
Updated 1mo ago

Evaluation Results

Method  Links
2025.12  13.5  50    20   -0.61
2025.12  13.1  46.7  20   --
2025.12  6.2   36.7  3.3  -15.33
2025.12  3.8   26.7  6.7  --
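
For readers unfamiliar with the metric names in the title, the following is a minimal sketch of how Mean@k and Pass@k are commonly computed from sampled answers. It assumes the usual definitions (Mean@k is the average per-sample accuracy over k samples, Pass@k is the share of problems solved by at least one of k samples); the page does not state its exact definitions, and the function names and toy data are hypothetical. The percentages listed above are consistent with a 30-problem set, for example 46.7 is about 14/30.

```python
from typing import List


def mean_at_k(correct: List[List[bool]]) -> float:
    """Average per-sample accuracy in percent (assumed Mean@k definition).

    correct[i][j] is True when sample j for problem i is right.
    """
    per_problem = [sum(row) / len(row) for row in correct]
    return 100.0 * sum(per_problem) / len(per_problem)


def pass_at_k(correct: List[List[bool]]) -> float:
    """Percent of problems with at least one correct sample (assumed Pass@k)."""
    solved = sum(any(row) for row in correct)
    return 100.0 * solved / len(correct)


# Hypothetical toy data: 4 problems, 4 samples each (not taken from the table).
toy = [
    [True, True, False, False],
    [False, False, False, True],
    [False, False, False, False],
    [False, False, False, False],
]

print(mean_at_k(toy))  # 18.75
print(pass_at_k(toy))  # 50.0
```

With 16 samples per problem, the same functions would give Mean@16 and Pass@16 under these assumed definitions.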