Multimodal Reasoning on GQA (test)
[Chart: results over time. Top entry: Ours, 67.4 Accuracy (Apr 12, 2026). Selectable metrics: Accuracy, Average AR Steps, Average Time (s).]
Evaluation Results
| Method         | Model        | Date    | Accuracy | Average AR Steps | Average Time (s) |
|----------------|--------------|---------|----------|------------------|------------------|
| Ours           | Qwen2-VL 7B  | 2026.04 | 67.4     | 9.2              | 0.82             |
| IVT-LR         | Qwen2-VL 7B  | 2026.04 | 65.8     | 10.1             | 0.68             |
| Chain-of-Focus | Qwen2-VL 7B  | 2026.04 | 61.8     | 128.6            | 3.01             |
| CCoT           | Qwen2-VL 7B  | 2026.04 | 51.2     | 76.4             | 7.21             |
| SCAFFOLD       | Qwen2-VL 7B  | 2026.04 | 48.7     | 72.8             | 6.72             |
| Ours           | Chameleon 7B | 2026.04 | 39.4     | 9.2              | 1.21             |
| IVT-LR         | Chameleon 7B | 2026.04 | 38.1     | 10.1             | 0.98             |
| Chain-of-Focus | Chameleon 7B | 2026.04 | 34.6     | 360.4            | 2.98             |
| CCoT           | Chameleon 7B | 2026.04 | 33.1     | 150.6            | 5.31             |
| SCAFFOLD       | Chameleon 7B | 2026.04 | 32.8     | 156              | 4.17             |