
Language Understanding on MMLU-Redux

Base Score: 0.3762

Instruct Model (Q1)

[Chart] Jan 27, 2026
Updated 4d ago

Evaluation Results

Method  Links
2026.01  0.3762  54.58   0.3742  -0.1696
2026.01  0.3571  54.58   0.3485  -0.1887
2026.01  0.2475  70.26   0.2461  -0.4552
2026.01  0.2365         -0.0184  -0.3093
2026.01  0.2356         -0.0546  -0.3102
2026.01  0.232   70.44   0.2097  -0.4724
2026.01  0.2294         -0.0178  -0.3164
2026.01  0.2005  53.16   0.0277  -0.331
2026.01  0.1879         -0.0389  -0.5147
2026.01  0.1744         -0.0591  -0.5282
2026.01  0.1714  64.46   0.019   -0.4732
2026.01  0.1696         -0.0331  -0.533
2026.01  0.1653  80.26   0.161   -0.6373
2026.01  0.1498  80.26   0.0965  -0.6528
2026.01  0.1403         -0.0287  -0.6623
2026.01  0.1379  73.54   0.0279  -0.5976
2026.01  0.1299         -0.0563  -0.6727
2026.01  0.1239         -0.0117  -0.6788
2026.01  0.1177  85.75   0.1071  -0.7399
2026.01  0.1101  85.75   0.0612  -0.7474
2026.01  0.1086         -0.0303  -0.749
2026.01  0.1084  81.12   0.0152  -0.7028
2026.01  0.1011         -0.0679  -0.7565
2026.01  0.0986         -0.0411  -0.7589