
Mathematical Reasoning on MathQA (Decoding Efficiency)

Average Acceptance Length τ: 2,555

GRPO

[Chart: Average Acceptance Length τ over time, Mar 10, 2026 to Mar 19, 2026]
Updated 2mo ago

Evaluation Results

Date     Average Acceptance Length τ  (unlabeled)  (unlabeled)
2026.03  2,555                        -            86.15
2026.03  2,313                        -            86.46
2026.03  5.16                         3.43         -
2026.03  4.7                          3.16         -
2026.03  4.66                         3.19         -
2026.03  4.55                         3            -
2026.03  3.52                         2.12         -
2026.03  3.02                         1.91         -
2026.03  3.01                         1.9          -
2026.03  2.93                         1.86         -
2026.03  1.23                         0.91         -
2026.03  1.04                         0.66         -