
Mathematical Reasoning on MATH-500 (Accuracy, Avg., Drop ↓)

95.6 Accuracy

SC@16

[Chart: accuracy over time, Jan 21, 2026 to Mar 3, 2026]
Updated 1mo ago

Evaluation Results

Date      Accuracy   Avg.    Drop ↓
2026.03   95.6       -       -
2026.03   95.6       -       -
2026.01   95.33      70.63   -
2026.03   94.3       -       -
2026.01   91.4       60.78   -9.85
2026.01   89.93      58.28   -12.35
2026.01   84.4       48.72   -
2026.03   80.8       -       -
2026.03   80.2       -       -
2026.01   74         41.1    -
2026.03   74         -       -
2026.01   73.2       41.31   -7.41
2026.01   64.8       38.39   -10.33
2026.01   30.27      21.44   -19.66
2026.01   21.67      17.34   -23.76
2026.01   1.2        2.11    -46.61
2026.01   0          4.84    -36.26