Multitask Language Understanding on MMMLU (non-EU languages, test)
Accuracy over time (chart): best result 77.4, Qwen-3-30B-A3B, Feb 5, 2026. Updated 4d ago.
Evaluation Results

Method               Release Type    Date       Accuracy
Qwen-3-30B-A3B       Open-weig...    2026.02    77.4
Qwen-3-32B           Open-weig...    2026.02    77
Llama-3.3-70B        Open-weig...    2026.02    75.5
Mistral-3.2-24B      Open-weig...    2026.02    72.5
Qwen-3-14B           Open-weig...    2026.02    72.4
Gemma-3-27B          Open-weig...    2026.02    72.1
Gemma-3-12B          Open-weig...    2026.02    66.6
OLMo-3.1-32B         Fully-ope...    2026.02    63
EuroLLM-22B (new)    Fully-ope...    2026.02    61.1
EuroLLM-9B (new)     Fully-ope...    2026.02    57.7
Apertus-70B          Fully-ope...    2026.02    57.6
EuroLLM-22B (old)    Fully-ope...    2026.02    56.6
EuroLLM-9B (old)     Fully-ope...    2026.02    51.3
Apertus-8B           Fully-ope...    2026.02    50.9
Llama-3.1-8B         Open-weig...    2026.02    50.2
OLMo-3-7B            Fully-ope...    2026.02    46.4