Language Understanding on MMLU (Accuracy and Prompt/Response Scores)

82.1 Accuracy

GPT-4o-mini

Updated 1mo ago

Evaluation Results

Date      Accuracy   Prompt/Response scores
2026.03   82.1       4.388   4.709   4.5
2026.03   80.3       4.412   4.605   4.35
2026.03   71.1       4.113   4.463   4.252
2026.03   68.7       4.188   4.577   4.302
2026.03   67.3       4.18    4.431   4.153
2026.03   66.8       -       -       -
2026.03   66.3       -       -       -
2026.03   65.1       -       -       -
2026.03   64.8       -       -       -
2026.03   64.5       -       -       -
2026.03   64.2       -       -       -
2026.03   63.9       -       -       -
2026.03   51.15      -       -       -
2026.03   51.1       -       -       -
2026.03   51.07      -       -       -
2026.03   50.46      -       -       -
2026.03   48.3       -       -       -
2026.03   27.1       -       -       -
2026.03   25.7       -       -       -
2026.03   25.3       -       -       -
2026.03   25.3       -       -       -
2026.03   25.3       -       -       -
2026.03   25.2       -       -       -
2026.03   24.9       -       -       -
2026.03   24.8       -       -       -
2026.03   24.8       -       -       -
2026.03   24.8       -       -       -
2026.03   24.8       -       -       -
2026.03   24.4       -       -       -
2026.03   24.3       -       -       -
2026.03   24.1       -       -       -
2026.03   23.7       -       -       -
2026.03   23.5       -       -       -
2026.03   23.1       -       -       -