Multi-image Reasoning on Mantis
Best Accuracy: 71, DPS (Ours)
[Figure: Accuracy by date; data point at Jan 12, 2026]
Evaluation Results
| Method | Model Category | Date | Accuracy |
|---|---|---|---|
| DPS (Ours) | Multi-i... | 2026.01 | 71 |
| CcDPO | Multi-i... | 2026.01 | 69.1 |
| VISC | Multi-i... | 2026.01 | 69.1 |
| Two-stage RL (Ours) | Multi-i... | 2026.01 | 68.4 |
| InternVL2.5 | Open-So... | 2026.01 | 67.7 |
| DAPO (Ours) | Multi-i... | 2026.01 | 67.7 |
| Qwen2.5-VL | Open-So... | 2026.01 | 64.5 |
| LLaVA-OneVision | Open-So... | 2026.01 | 64.2 |
| mPLUG-Owl3 | Multi-i... | 2026.01 | 63.1 |
| GPT-4V | Closed-... | 2026.01 | 62.7 |
| LLaVA-NeXT-Interleave | Multi-i... | 2026.01 | 62.7 |
| MIA-DPO | Multi-i... | 2026.01 | 60.4 |
| Mantis-Idefics2 | Multi-i... | 2026.01 | 57.1 |
| VideoRFT | Multi-i... | 2026.01 | 56.7 |
| VILA1.5 | Open-So... | 2026.01 | 51.2 |
| TW-GRPO | Multi-i... | 2026.01 | 49.8 |
| LLaVA 1.6 | Open-So... | 2026.01 | 45.6 |
| OpenFlamingo-v2 | Open-So... | 2026.01 | 12.4 |