
STEM Reasoning on MMLU STEM

73.7 Accuracy (STEM)

TaH+

[Chart: Accuracy (STEM) over time, Nov 11, 2025 to May 20, 2026; y-axis ticks from 41.356 to 66.547]
Updated 13d ago

Evaluation Results

Date      Accuracy (STEM)
2025.11   73.7
2025.11   70.8
2025.11   70.6
2025.11   63.8
2026.05   58.1
2026.05   57.9
2026.05   57.6
2026.05   57.1
2025.11   56.3
-         56
2025.11   51.6
2025.11   51.4
2026.05   48.8
2026.05   46.9
2025.11   42.6