
Mathematical Reasoning on AIME 2024 (Accuracy, Length, and Length-Accuracy Metrics)

Accuracy: 36.7

Effi. Reasoning

[Chart: Accuracy (axis ticks 26.3, 29, 31.7, 34.4), Jun 9, 2025]
Updated 12d ago

Evaluation Results

Date      Accuracy   Length   Length-Accuracy
2025.06   36.7       5,771    19.9
2025.06   36.7       5,400    21.4
2025.06   33.3       5,159    20.3
2025.06   33.3       2,943    26.7
2025.06   33.3       2,943    26.7
2025.06   30         6,183    14.9
2025.06   26.7       6,961    10.3
2025.06   26.7       5,958    13.9