Mathematical Reasoning on Math500 (ACC, Avg Tokens)

[Chart: Accuracy (%) by base model over time, Aug 5, 2025 to Mar 9, 2026; best accuracy 97]
Updated 1mo ago

Evaluation Results

Date      ACC (%)   Avg Tokens
2026.03   97        -          6,680
2026.03   96.6      -          3,488
2026.03   93.1      -          1,903
2026.03   93        -          2,753
2026.03   92.3      -          3,928
2025.08   91.37     7,138.49   -
2025.08   91.13     5,209.24   -
2026.03   88.5      -          1,346
2025.08   88.17     3,703.83   -
2025.08   88.17     2,604.23   -
2026.03   85.1      -          2,720
2026.03   84.9      -          5,420
2026.03   84.5      -          2,645
2025.08   84.37     2,853.73   -
2026.03   84.1      -          2,744
2025.08   82.16     1,980.97   -
2026.03   75.2      -          3,421
2026.03   72.5      -          2,804
2026.03   72        -          3,971
2026.03   71.4      -          2,950
2026.03   71        -          1,008
2026.03   44.3      -          3,287
2026.03   43        -          4,829