
Knowledge Evaluation on MMLU-Redux 2.0 (Continual)

Accuracy: 33.49

STOC

[Chart: accuracy over time; y-axis ticks at 23.7972, 26.3136, 28.83, 31.3464; x-axis label May 11, 2026]
Updated 22d ago

Evaluation Results

Date      Accuracy  Method  Links
2026.05   33.49
2026.05   32.93
2026.05   31.71
2026.05   28.05
2026.05   24.39
2026.05   24.17