Mathematical Reasoning on DeepMath500

69 Pass@1 Rate

Baseline LLM

[Chart: Pass@1 Rate, Mar 4, 2026]
Updated 1mo ago

Evaluation Results

Method    Links

Date      Pass@1 Rate
2026.03   69            14,882   100
2026.03   67.5          13,882   59.9
2026.03   66.3          15,254   72.5
2026.03   63.3          9,562    41.7
2026.03   61.4          7,659    35.6
2026.03   60.3          11,594   43.6
2026.03   57.6          2,511    0
2026.03   57.6          2,672    1
2026.03   57.6          2,511    0
2026.03   56.6          3,766    0
2026.03   55.7          3,490    0
2026.03   42.7          4,423    0