Reasoning on MATH (Accuracy, Latency, Speedup)

Top result: LR, 92.84 Accuracy (%)

Updated 2mo ago

Evaluation Results

Method	Links	Accuracy (%)	Latency	Speedup
LR 2026.03		92.84	9.56	1.2
Target Model 2026.03		91.54	1	1
Online-LR 2026.03		91.37	10.63	1.24
OSD-LR 2026.03		89.87	6.21	1.1
Draft Model 2026.03		60.66	1	3.54
Rubric-grounded GRPO 2026.05		52.88	-	-
Llama-3.1-8B-Instruct 2026.05		50.06	-	-