
Multilingual Reasoning on MMLU-ProX (1k Stratified Subset Test)

18.7 Accuracy

Task Arithmetic

[Chart: y-axis ticks 16.308, 16.929, 17.55, 18.171; x-axis label Feb 9, 2026]
Updated 4d ago

Evaluation Results

Method    Links
2026.02    18.7    46.4
2026.02    18.5    46.4
2026.02    18.2    47.8
2026.02    17.6    15.7
2026.02    17      48.4
2026.02    16.4    47.2