Multimodal Reasoning on MMBench v1.1 (test)
[Chart: Overall Score over time, Feb 4, 2026 to Apr 14, 2026. Top entry: Upper Bound, Overall Score 82.2.]
Updated 4d ago
Evaluation Results
Method       Details                    Date     Overall Score
Upper Bound  Backbone=Qwen2.5-VL-7B     2026.04  82.2
CLASP        Backbone=Qwen2.5-VL-7B...  2026.04  78.6
SparseVLM    Backbone=Qwen2.5-VL-7B...  2026.04  78.4
CLASP        Backbone=Qwen2.5-VL-7B...  2026.04  76.2
SparseVLM    Backbone=Qwen2.5-VL-7B...  2026.04  75.6
CLASP        Backbone=Qwen2.5-VL-7B...  2026.04  72.7
SparseVLM    Backbone=Qwen2.5-VL-7B...  2026.04  72.1
Vanilla      Token Budget=1296 Toke...  2026.02  0.837
PIO-FVLM     Token Budget=Retain 33...  2026.02  0.821
DART         Token Budget=Retain 33...  2026.02  0.809
PIO-FVLM     Token Budget=Retain 22...  2026.02  0.809
DART         Token Budget=Retain 22...  2026.02  0.789
PIO-FVLM     Token Budget=Retain 11...  2026.02  0.776
DART         Token Budget=Retain 11...  2026.02  0.733
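
The Overall Score values above appear on two scales: the 2026.04 entries look like percentages (for example 82.2), while the 2026.02 entries look like fractions of 1 (for example 0.837). The page does not state this, so the following is only a minimal sketch under that assumption, showing one way to put the scores on a common 0 to 100 scale before comparing them. The names `ROWS` and `to_percent` are hypothetical, and the truncated detail strings are copied as shown.

```python
# Scores copied from the results listing; detail strings are truncated on the page.
ROWS = [
    ("Upper Bound", "Backbone=Qwen2.5-VL-7B", 82.2),
    ("CLASP", "Backbone=Qwen2.5-VL-7B...", 78.6),
    ("SparseVLM", "Backbone=Qwen2.5-VL-7B...", 78.4),
    ("CLASP", "Backbone=Qwen2.5-VL-7B...", 76.2),
    ("SparseVLM", "Backbone=Qwen2.5-VL-7B...", 75.6),
    ("CLASP", "Backbone=Qwen2.5-VL-7B...", 72.7),
    ("SparseVLM", "Backbone=Qwen2.5-VL-7B...", 72.1),
    ("Vanilla", "Token Budget=1296 Toke...", 0.837),
    ("PIO-FVLM", "Token Budget=Retain 33...", 0.821),
    ("DART", "Token Budget=Retain 33...", 0.809),
    ("PIO-FVLM", "Token Budget=Retain 22...", 0.809),
    ("DART", "Token Budget=Retain 22...", 0.789),
    ("PIO-FVLM", "Token Budget=Retain 11...", 0.776),
    ("DART", "Token Budget=Retain 11...", 0.733),
]


def to_percent(score: float) -> float:
    """Assumption: values at or below 1.0 are fractions; larger values are percentages."""
    return score * 100.0 if score <= 1.0 else score


# Normalize every row to a 0-100 scale, rounded to one decimal place.
normalized = [(name, details, round(to_percent(score), 1)) for name, details, score in ROWS]

for name, details, pct in normalized:
    print(f"{name:<12} {details:<27} {pct:5.1f}")
```

Even after normalization, the two groups may not be directly comparable, since they come from different submissions with different setups (a fixed backbone in one, a token budget in the other).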