Knowledge Evaluation on MMLU-Redux

Brier Score: 0.1083

Isotonic Regression

[Chart: Brier Score; Jan 27, 2026]
Updated 4d ago

Evaluation Results

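Method | Date | Brier Score | | | | Links
 | 2026.01 | 0.1083 | - | 0.0186 | -0.7245 |
 | 2026.01 | 0.1116 | - | 0.0584 | -0.7212 |
 | 2026.01 | 0.1198 | - | 0.0177 | -0.713 |
 | 2026.01 | 0.1229 | 77.35 | 0.0326 | -0.6506 |
 | 2026.01 | 0.1232 | 83.28 | 0.0659 | -0.7096 |
 | 2026.01 | 0.1278 | 75.63 | 0.0534 | -0.6285 |
 | 2026.01 | 0.1337 | 83.28 | 0.125 | -0.6991 |
 | 2026.01 | 0.1494 | - | 0.0528 | -0.6427 |
 | 2026.01 | 0.1509 | - | 0.0465 | -0.6412 |
 | 2026.01 | 0.1519 | 80.52 | 0.0664 | -0.6534 |
 | 2026.01 | 0.1571 | - | 0.0546 | -0.635 |
 | 2026.01 | 0.158 | 65.42 | 0.0176 | -0.4962 |
 | 2026.01 | 0.172 | 79.21 | 0.1628 | -0.6201 |
 | 2026.01 | 0.1807 | - | 0.0185 | -0.5366 |
 | 2026.01 | 0.1874 | - | 0.0352 | -0.5299 |
 | 2026.01 | 0.1889 | - | 0.0188 | -0.5285 |
 | 2026.01 | 0.2153 | 71.74 | 0.1679 | -0.5022 |
 | 2026.01 | 0.245 | 71.74 | 0.2417 | -0.4724 |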
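
For reference, the Brier score used to rank these results is the mean squared difference between a predicted probability and the 0/1 outcome, so lower is better. The sketch below is a minimal Python illustration of that definition. The function name `brier_score` and the sample confidences and outcomes are hypothetical, not taken from this leaderboard or its evaluation code.

```python
def brier_score(confidences, outcomes):
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    if len(confidences) != len(outcomes):
        raise ValueError("confidences and outcomes must have the same length")
    if not confidences:
        raise ValueError("need at least one prediction")
    squared_errors = [(p - y) ** 2 for p, y in zip(confidences, outcomes)]
    return sum(squared_errors) / len(squared_errors)


# Hypothetical data: the model's stated confidence that each answer is
# correct, and whether the answer actually was correct (1) or not (0).
confidences = [0.9, 0.8, 0.6, 0.3]
outcomes = [1, 1, 0, 0]

score = brier_score(confidences, outcomes)
print(round(score, 4))
```

A model that always reports 0.5 confidence scores 0.25 under this definition, which gives a rough baseline for reading the values in the table.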