
Multiple Choice Question Answering on MMLU-Redux (test)

0.1116 BS

TS (Supervised)

[Chart: axis ticks 0.106264, 0.142282, 0.1783, 0.214318; Jan 27, 2026]
Updated 4d ago

Evaluation Results

Date      BS
2026.01   0.1116   -       0.0584   -0.7212
2026.01   0.1229   77.35   0.0326   -0.6506
2026.01   0.1232   83.28   0.0659   -0.7096
2026.01   0.1278   75.63   0.0534   -0.6285
2026.01   0.1337   83.28   0.125    -0.6991
2026.01   0.1519   80.52   0.0664   -0.6534
2026.01   0.1571   -       0.0546   -0.635
2026.01   0.158    65.42   0.0176   -0.4962
2026.01   0.172    79.21   0.1628   -0.6201
2026.01   0.1874   -       0.0352   -0.5299
2026.01   0.2153   71.74   0.1679   -0.5022
2026.01   0.245    71.74   0.2417   -0.4724