Long-context Multimodal Understanding on MMLong
[Chart: Overall Score by date. Best result: GPT-4o, 42.3 (May 13, 2026)]
Evaluation Results
Method                 Size   Date      Overall Score
GPT-4o                 -      2026.05   42.3
VL-Scaler-MiMO         7B     2026.05   39.5
VL-Scaler              7B     2026.05   33.1
Docopilot              8B     2026.05   28.8
GPT-4o-mini            -      2026.05   28.6
MiMO-VL-Instruct       7B     2026.05   27.5
Qwen2.5-VL-Instruct    72B    2026.05   24.9
Pixel Reasoner         7B     2026.05   22
Qwen2.5-VL-Instruct    7B     2026.05   21.2
mPLUG-Owl3             7B     2026.05   21
VL-Rethinker           7B     2026.05   20.9
Llava-OV               7B     2026.05   19.5
OpenVLThinker          7B     2026.05   18.6
DeepEyes               7B     2026.05   17.5
R1-VL                  7B     2026.05   12.7