
General Reasoning on MMLU (Accuracy and Token Cost)

[Chart: Accuracy by date, Apr 15, 2025 to Feb 4, 2026. Best accuracy: 81.5]
Updated 1mo ago

Evaluation Results

Date     Accuracy  Token Cost
2025.12  81.5      -58.7       -      -      -      -
2025.12  80.4      142.8       -      -      -      -
2025.12  80.1      156.3       -      -      -      -
2025.12  79.5      127.6       -      -      -      -
2025.12  79.4      -53.5       -      -      -      -
2025.12  79.1      187.2       -      -      -      -
2025.12  78.5      289.4       -      -      -      -
2025.12  78.2      345.7       -      -      -      -
2025.12  77.3      0           -      -      -      -
2025.12  76.4      -54.2       -      -      -      -
2025.12  75.9      -50.8       -      -      -      -
2025.04  74.94     -           -      -      -      -
2025.04  74.85     -           -      -      -      -
2025.04  74.7      -           -      -      -      -
2025.04  74.46     -           -      -      -      -
2025.04  74.26     -           -      -      -      -
2025.04  74.24     -           -      -      -      -
2025.04  74.2      -           -      -      -      -
2025.04  73.2      -           -      -      -      -
2025.04  73.18     -           -      -      -      -
2025.04  72.77     -           -      -      -      -
2026.02  70.41     -           -      -      -      -
2026.02  63.3      -           -      -      -      -
2026.02  24.48     -           -      -      -      -
2026.02  4.29      -           -      -      -      -
2026.03  -         -           38.83  47.22  52.32  40.95
2026.03  -         -           39.17  47.12  51.12  38.69
2026.03  -         -           38.55  45.19  49.76  36.95
2026.03  -         -           39.21  47.34  52.84  41.8
2026.03  -         -           48.63  60.41  64.09  53.47
2026.03  -         -           48.63  60.32  64.38  53.35
2026.03  -         -           48.8   60.4   64.41  54.14
2026.03  -         -           49     60.83  65     54.61
2026.03  -         -           59.13  71.07  77.87  68.25
2026.03  -         -           59.62  71     78     69.01
2026.03  -         -           59.36  71.12  78.03  68.79
2026.03  -         -           59.6   71.22  77.93  68.63
2026.03  -         -           59.7   71.13  78.1   69.14