Math Reasoning on AIME 2022–2024

Accuracy: 9.27

GRPO + WeMask (TF)

[Chart: Accuracy by date; May 8, 2026]
Updated 22d ago

Evaluation Results

Date      Accuracy
2026.05   9.27
2026.05   7.8
2026.05   7.4
2026.05   5.92