
Mathematical Reasoning on Math500 (Accuracy, Acc Drop)

Accuracy: 94.81, FACT-E (Standard)

[Chart: results by date, Feb 11, 2026 to Apr 12, 2026]
Updated 4d ago

Evaluation Results

Accuracy   Acc Drop   Date
94.81      -          2026.04
94.8       -          2026.04
94.26      -          2026.04
93.17      -          2026.04
93.17      -          2026.04
92.9       -          2026.04
92.9       -          2026.04
92.9       -          2026.04
92.9       -          2026.04
92.08      -          2026.04
90.32      -          2026.04
88.3       -          2026.04
88.3       -          2026.04
87.6       -
87.43      -          2026.04
87.1       -          2026.04
86.61      -          2026.04
85.52      -          2026.04
85.52      -          2026.04
84.7       -          2026.04
83.33      -          2026.04
83.06      -          2026.04
82.79      -          2026.04
80.6       -          2026.04
78.69      -          2026.04
78.1       -          2026.03
77.84      -          2026.03
77.84      -          2026.03
77.08      -          2026.03
76.21      -          2026.03
76.2       -          2026.04
75.91      -          2026.03
75.3       -          2026.04
74.8       -
72.9       -          2026.04
54.09      -          2026.04
52.18      -          2026.04
52.18      -          2026.04
51.09      -          2026.04
48.09      -          2026.04
42.62      -          2026.04
41         -          2026.04
0.848      -          2026.02
0.826      -2.2       2026.02
0.802      5.2        2026.02
0.786      -6.2       2026.02
0.782      3.2        2026.02
0.782      -6.6       2026.02
0.75       -          2026.02
0.74       -1         2026.02