Multimodal Benchmarking on MMB 1.1
[Figure: Accuracy chart. Top result: GPT-4o, 82.2 accuracy, Sep 25, 2025.]
Evaluation Results
Method                                    Date      Accuracy
GPT-4o                                    2025.09   82.2
DeFacto (Ours), Backbone=Qwen2.5-VL-7B    2025.09   81.2
Pixel Reasoner                            2025.09   78.5
Visual-SR1                                2025.09   77.4
Chain-of-Focus                            2025.09   75.3
DeepEyes                                  2025.09   29.4