Multi-modal Reasoning on MMBench Overall & Relation Reasoning
Chart: Overall Accuracy by date. Best result: ChainMPQ, 84.7 (Oct 7, 2025).
Metrics: Overall Accuracy, Overall Accuracy Delta, RR Score, RR Score Delta
Updated 1mo ago
Evaluation Results
| Method | Backbone | Date | Overall Accuracy | Overall Accuracy Delta | RR Score | RR Score Delta |
|---|---|---|---|---|---|---|
| ChainMPQ | Qwen2.5-VL-7B | 2025.10 | 84.7 | 1.5 | 81.8 | 3.6 |
| ChainMPQ | InternVL3-8B | 2025.10 | 84.2 | 0.6 | 83.9 | 1.4 |
| InternVL3-8B | InternVL3-8B | 2025.10 | 83.6 | - | 82.5 | - |
| Qwen2.5-VL-7B | Qwen2.5-VL-7B | 2025.10 | 83.2 | - | 78.2 | - |
| ChainMPQ | LLaVA-v1.5-7B | 2025.10 | 67.8 | 1.3 | 61.3 | 2.5 |
| LLaVA-v1.5-7B | LLaVA-v1.5-7B | 2025.10 | 66.5 | - | 58.8 | - |
| ChainMPQ | InstructBLIP-7B | 2025.10 | 65.5 | 1.6 | 55.2 | 2.7 |
| InstructBLIP-7B | InstructBLIP-7B | 2025.10 | 63.9 | - | 52.5 | - |
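The delta columns appear to be the ChainMPQ score minus the score of the same backbone without ChainMPQ. The page does not state this definition, so treat it as an assumption, though it matches every listed value. A minimal Python sketch that recomputes the deltas from the scores above (the dictionary and function names are illustrative, not part of the leaderboard):

```python
# Scores copied from the leaderboard rows as (overall accuracy, RR score).
BASELINE = {
    "Qwen2.5-VL-7B": (83.2, 78.2),
    "InternVL3-8B": (83.6, 82.5),
    "LLaVA-v1.5-7B": (66.5, 58.8),
    "InstructBLIP-7B": (63.9, 52.5),
}
CHAINMPQ = {
    "Qwen2.5-VL-7B": (84.7, 81.8),
    "InternVL3-8B": (84.2, 83.9),
    "LLaVA-v1.5-7B": (67.8, 61.3),
    "InstructBLIP-7B": (65.5, 55.2),
}


def compute_deltas(with_method, without_method):
    """Return {backbone: (accuracy delta, RR delta)}.

    Assumption: delta = score with the method - baseline score
    for the same backbone, rounded to one decimal place.
    """
    deltas = {}
    for backbone, (acc, rr) in with_method.items():
        base_acc, base_rr = without_method[backbone]
        deltas[backbone] = (round(acc - base_acc, 1), round(rr - base_rr, 1))
    return deltas


if __name__ == "__main__":
    for backbone, (d_acc, d_rr) in compute_deltas(CHAINMPQ, BASELINE).items():
        print(f"{backbone}: accuracy +{d_acc}, RR +{d_rr}")
```

Under this reading, the largest gains in both Overall Accuracy and RR Score fall on the backbones other than InternVL3-8B, which already has the strongest baseline RR Score.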