
Mathematical Reasoning on Overall

80.17 Accuracy

SGPO

[Chart: accuracy over time, May 16, 2025 to Jan 31, 2026]
Updated 1mo ago

Evaluation Results

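Method | Links
2025.05 | 80.17 | - | - | -
2025.05 | 79.15 | - | - | -
2025.05 | 78.84 | - | - | -
2025.05 | 66.19 | - | - | -
2025.05 | 65.42 | - | - | -
2025.05 | 65.41 | - | - | -
2025.05 | 65.08 | - | - | -
2025.05 | 64.91 | - | - | -
2026.01 | 64.5 | - | - | 8,499
2025.05 | 63.72 | - | - | -
2025.05 | 62.85 | - | - | -
2025.05 | 53.81 | - | - | -
2025.05 | 53.06 | - | - | -
2025.05 | 51.58 | - | - | -
2026.01 | 46.9 | - | - | 13,506
2026.01 | 45.6 | - | - | 9,852
2025.05 | 45.06 | - | - | -
2025.06 | 45 | - | - | 3,696
2025.05 | 44.39 | - | - | -
2025.06 | 44 | - | - | 6,183
2025.05 | 43.74 | - | - | -
2025.06 | 43.1 | - | - | 1,334
2025.05 | 41 | - | - | -
2025.05 | 40.11 | - | - | -
2025.05 | 39.85 | - | - | -
2025.06 | 37 | - | - | 1,949
2025.06 | 36.5 | - | - | 2,437
2025.06 | 34.2 | - | - | 3,230
2025.06 | 34.2 | - | - | 1,653
2025.06 | 32.6 | - | - | 6,870
2025.06 | 27.1 | - | - | 1,118
2025.06 | 20.5 | - | - | 975
2025.06 | 17.3 | - | - | 1,000
2025.06 | 15.5 | - | - | 1,712
2025.06 | 14.6 | - | - | 767
2025.06 | 13.2 | - | - | 1,922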
2026.01 | - | 51.7 | 73.2 | 9,388
2026.01 | - | 65.8 | 80 | 7,709
2026.01 | - | 45 | 66.5 | 9,195