Multi-modal Instruction Following on MM MTBench
[Chart: Overall Score over time, Jan 13, 2026 to Apr 6, 2026. Top result: Ministral 3, 84.9.]
Updated 11d ago
Evaluation Results
| Method | Details | Date | Overall Score |
|---|---|---|---|
| Ministral 3 | Model Size=14B | 2026.01 | 84.9 |
| Ministral 3 | Model Size=8B | 2026.01 | 80.8 |
| Vero Q3I-8B | RL=Qwen3VL, Initial Mo... | 2026.04 | 80.3 |
| Qwen3-VL | Model Size=4B, Variant... | 2026.01 | 80.08 |
| Qwen3-VL | Model Size=8B, Variant... | 2026.01 | 80 |
| MiMoVL 7B-RL | RL=MiMoVL, Initial Mod... | 2026.04 | 79.2 |
| Ministral 3 | Model Size=3B | 2026.01 | 78.3 |
| Q3VL 8B-Thk | RL=N/A, Initial Model=N/A | 2026.04 | 77.8 |
| Vero Mi-7B | RL=MiMoVL, Initial Mod... | 2026.04 | 74.7 |
| Q3VL 8B-Ins | RL=N/A, Initial Model=N/A | 2026.04 | 74.4 |
| Vero Q3T-8B | RL=Qwen3VL, Initial Mo... | 2026.04 | 74.3 |
| GPT-5 Nano | RL=N/A, Initial Model=N/A | 2026.04 | 72.7 |
| Gemma3 | Model Size=12B, Varian... | 2026.01 | 67 |
| Qwen3-VL | Model Size=2B, Variant... | 2026.01 | 63.6 |
| Vero Q25-7B | RL=Qwen25VL, Initial M... | 2026.04 | 62.8 |
| Q25VL 7B-Ins | RL=N/A, Initial Model=N/A | 2026.04 | 58.9 |
| Gemma3 | Model Size=4B, Variant... | 2026.01 | 52.3 |
| Mo2-O 7B | RL=SFT Only, Initial M... | 2026.04 | 33.3 |