Multi-image understanding on QBench2
[Chart: Accuracy over time. Best result: 81.7 (DelimScaling), Feb 2, 2026]
Updated 4d ago
Evaluation Results
Method        Configuration              Date     Accuracy
DelimScaling  Backbone=Qwen2.5-VL, M...  2026.02  81.7
Qwen2.5-VL    Model Size=32B, Method...  2026.02  81.4
DelimScaling  Backbone=InternVL3, Mo...  2026.02  80.1
InternVL3     Model Size=14B, Method...  2026.02  79.6
DelimScaling  Backbone=InternVL3, Mo...  2026.02  76.6
DelimScaling  Backbone=Qwen2.5-VL, M...  2026.02  76.5
InternVL3     Model Size=8B, Method...   2026.02  76.5
Qwen2.5-VL    Model Size=7B, Method...   2026.02  75.8
DelimScaling  Backbone=LLaVA-OV, Mod...  2026.02  74.2
LLaVA-OV      Model Size=7B, Method...   2026.02  73.9
DelimScaling  Backbone=InternVL3, Mo...  2026.02  65.6
InternVL3     Model Size=2B, Method...   2026.02  65.2
DelimScaling  Backbone=Qwen2.5-VL, M...  2026.02  63.3
Qwen2.5-VL    Model Size=3B, Method...   2026.02  62.7
DelimScaling  Backbone=LLaVA-OV, Mod...  2026.02  51.9
LLaVA-OV      Model Size=0.5B, Metho...  2026.02  51.7
InternVL3     Model Size=1B, Method...   2026.02  50.8
DelimScaling  Backbone=InternVL3, Mo...  2026.02  50.2