LLM Performance Estimation on MMLU (test)

0.842 MAE (%)

SparseEval

Feb 8, 2026
Updated 4d ago

Evaluation Results

Date       MAE (%)   (unlabeled)
2026.02    0.842     0.908
2026.02    0.962     0.896
2026.02    0.997     0.89
2026.02    1.282     0.871
2026.02    1.718     0.832
2026.02    1.964     0.856
2026.02    2.019     0.862
2026.02    2.202     0.829
2026.02    2.216     0.876
2026.02    2.331     0.85
2026.02    2.421     0.857
2026.02    2.537     0.798
2026.02    2.677     0.845
2026.02    2.83      0.801
2026.02    3.19      0.71
2026.02    3.802     0.692
2026.02    4.046     0.755
2026.02    4.898     0.727
2026.02    5.94      0.569
2026.02    7.89      0.764