
Multimodal Benchmarking on MMB 1.1

Top result: GPT-4o, 82.2 accuracy

Updated 2mo ago

Evaluation Results

Method	Date	Accuracy
GPT-4o	2025.09	82.2
DeFacto (Ours)	2025.09	81.2
Pixel Reasoner	2025.09	78.5
Visual-SR1	2025.09	77.4
Chain-of-Focus	2025.09	75.3
DeepEyes	2025.09	29.4