Multi-discipline Understanding on MMLU
Best result: POP, 75.05 Accuracy
[Chart: Accuracy by date; date shown: Feb 3, 2026]
Updated 4d ago
Evaluation Results
Method      Backbone            Date      Accuracy
POP         Qwen3-VL-8B-I...    2026.02   75.05
Full Model  Qwen3-VL-8B-I...    2026.02   74.95
Wanda       Qwen3-VL-8B-I...    2026.02   73.78
Full Model  Gemma-3-12B-I...    2026.02   71.46
POP         Gemma-3-12B-I...    2026.02   71.37
Wanda       Gemma-3-12B-I...    2026.02   69.7
Full Model  Llama-3.1-8B-...    2026.02   68.33
POP         Llama-3.1-8B-...    2026.02   67.43
Wanda       Llama-3.1-8B-...    2026.02   65.87
ShortGPT    Llama-3.1-8B-...    2026.02   65.8
SliceGPT    Qwen3-VL-8B-I...    2026.02   39.16
SliceGPT    Llama-3.1-8B-...    2026.02   34.97
ShortGPT    Qwen3-VL-8B-I...    2026.02   33.85
ShortGPT    Gemma-3-12B-I...    2026.02   23.81
SliceGPT    Gemma-3-12B-I...    2026.02   22.95
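
The results above are easier to compare per backbone when each method is expressed as a gap to that backbone's "Full Model" row. The following is a minimal sketch of that arithmetic, assuming "Full Model" denotes the unmodified baseline for each backbone (the page does not say so explicitly). The dictionary layout and the `accuracy_delta` helper are hypothetical names introduced for illustration; the accuracy values are copied from the listing, and backbone names are kept truncated as they appear in the source.

```python
# Accuracy values copied from the results listing above.
# Backbone names are truncated in the source and are kept as-is.
results = {
    "Qwen3-VL-8B-I...": {
        "Full Model": 74.95, "POP": 75.05, "Wanda": 73.78,
        "SliceGPT": 39.16, "ShortGPT": 33.85,
    },
    "Gemma-3-12B-I...": {
        "Full Model": 71.46, "POP": 71.37, "Wanda": 69.7,
        "ShortGPT": 23.81, "SliceGPT": 22.95,
    },
    "Llama-3.1-8B-...": {
        "Full Model": 68.33, "POP": 67.43, "Wanda": 65.87,
        "ShortGPT": 65.8, "SliceGPT": 34.97,
    },
}


def accuracy_delta(results, baseline_key="Full Model"):
    """Return {backbone: {method: accuracy minus baseline accuracy}} in points.

    Assumption: the row named by baseline_key is the reference for its backbone.
    """
    deltas = {}
    for backbone, scores in results.items():
        baseline = scores[baseline_key]
        deltas[backbone] = {
            method: round(acc - baseline, 2)
            for method, acc in scores.items()
            if method != baseline_key
        }
    return deltas


deltas = accuracy_delta(results)

# Print each backbone's methods from smallest to largest accuracy loss.
for backbone, per_method in deltas.items():
    ranked = sorted(per_method.items(), key=lambda kv: kv[1], reverse=True)
    for method, delta in ranked:
        print(f"{backbone:18s} {method:9s} {delta:+.2f}")
```

Reporting the gap in accuracy points rather than as a ratio keeps the numbers directly comparable with the Accuracy column.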