Mathematical Reasoning on MATH-500 (Accuracy, Avg., Drop ↓)

Accuracy: 95.33 (BF16 Baseline)

[Chart: Accuracy, Jan 21, 2026]
Updated 4d ago

Evaluation Results

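| Method | Date | Accuracy | Avg. | Drop ↓ | Links |
|---|---|---|---|---|---|
| | 2026.01 | 95.33 | 70.63 | - | |
| | 2026.01 | 91.4 | 60.78 | -9.85 | |
| | 2026.01 | 89.93 | 58.28 | -12.35 | |
| | 2026.01 | 84.4 | 48.72 | - | |
| | 2026.01 | 74 | 41.1 | - | |
| | 2026.01 | 73.2 | 41.31 | -7.41 | |
| | 2026.01 | 64.8 | 38.39 | -10.33 | |
| | 2026.01 | 30.27 | 21.44 | -19.66 | |
| | 2026.01 | 21.67 | 17.34 | -23.76 | |
| | 2026.01 | 1.2 | 2.11 | -46.61 | |
| | 2026.01 | 0 | 4.84 | -36.26 | |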
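
The Drop values above appear to be each row's Avg. minus the Avg. of a baseline row (a row with no Drop value). The page does not state this. It is inferred from the arithmetic, and the pairing of each row with a baseline below is likewise an assumption. A minimal Python sketch that checks the published figures under that reading (`rows` and `drop` are illustrative names):

```python
# Avg. and Drop figures copied from the results above.
# ASSUMPTION: the baseline paired with each row is inferred from the
# arithmetic; the leaderboard itself does not say which baseline applies.
rows = [
    # (assumed baseline Avg., method Avg., published Drop)
    (70.63, 60.78, -9.85),
    (70.63, 58.28, -12.35),
    (48.72, 41.31, -7.41),
    (48.72, 38.39, -10.33),
    (41.10, 21.44, -19.66),
    (41.10, 17.34, -23.76),
    (48.72, 2.11, -46.61),
    (41.10, 4.84, -36.26),
]


def drop(baseline_avg: float, method_avg: float) -> float:
    """Return the change in Avg. relative to the baseline, to 2 decimals."""
    return round(method_avg - baseline_avg, 2)


for baseline_avg, method_avg, published in rows:
    computed = drop(baseline_avg, method_avg)
    status = "matches" if abs(computed - published) < 1e-9 else "differs"
    print(f"{method_avg:6.2f} vs {baseline_avg:6.2f}: {computed:7.2f} ({status})")
```

Under this reading a more negative Drop means a larger loss in Avg. relative to the baseline, which fits the ↓ marker in the title.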