Multimodal Deep Search on MMBC
[Chart: Accuracy. Highlighted result: Gemini-2.5 Pro, 13.8 Accuracy, May 11, 2026. Updated 22d ago.]
Evaluation Results
Method                    Evaluation Setting   Date      Accuracy
Gemini-2.5 Pro            Age...               2026.05   13.8
Qwen3-VL-8B + ODE-RL      Vis...               2026.05   12.5
Gemini-2.5 Flash          Age...               2026.05   11.6
GPT-5                     Dir...               2026.05   11.2
Qwen3-VL-30B + ODE-RL     Vis...               2026.05   11.2
Gemini-2.5 Pro            Dir...               2026.05   10.3
Qwen3-VL-30B + ODE-SFT    Vis...               2026.05   10.3
Qwen3-VL-8B + ODE-SFT     Vis...               2026.05   8.5
Qwen3-VL-8B               Vis...               2026.05   7.6
Qwen3-VL-30B              Vis...               2026.05   7.1
Qwen3-VL-8B               Age...               2026.05   6.2
Qwen3-VL-30B              Age...               2026.05   6.2
Gemini-2.5 Flash          Dir...               2026.05   4.9
Qwen3-VL-30B              Dir...               2026.05   4.5
Qwen3-VL-8B               Dir...               2026.05   4.0