SOTA Vision-Language Understanding benchmarks and papers with code | Wizwand
Vision-Language Understanding
Benchmarks
| Dataset Name | SOTA Method | Metric | Score | Results | Last Updated |
| --- | --- | --- | --- | --- | --- |
| MMBench | Qwen3-VL-4B-Instruct | Accuracy | 88.7 | 64 | 8d ago |
| MM-Vet | REVIS | Total Score | 72.16 | 43 | 3mo ago |
| Vision-Language Benchmark Suite Aggregate | Vanilla | Aggregate Performance (%) | 100 | 34 | 1mo ago |
| MMBench cn | Vanilla | Accuracy | 60.6 | 24 | 1mo ago |
| MME | Vanilla | Average Score | 100 | 18 | 1mo ago |
| Naturalbench | SemDeDup | General Score | 13.2 | 13 | 8d ago |
| VStarBench | Qwen3-VL-8B | Accuracy | 83.77 | 12 | 14d ago |
| MVerse VI | Qwen3-VL-8B | Accuracy | 43.78 | 12 | 14d ago |
| MVerse VO | Qwen3-VL-8B | Accuracy | 39.97 | 12 | 14d ago |
| MVista | Qwen3-VL-8B | Accuracy | 75.9 | 12 | 14d ago |
| NegBench VOC2007 | InternVL-3.0-8B | Accuracy | 95.37 | 11 | 26d ago |
| NegBench COCO | InternVL-3.0-8B | Accuracy | 93.19 | 11 | 26d ago |
| SEED-Bench Image | Vanilla | Average Accuracy | 100 | 11 | 1mo ago |
| Vision-Language Evaluation Suite MMB, MMStar, MMMU, Hallusion, AI2D, OCR, SEED, SQA (test val) | Qwen2-VL-7B (Teacher) | MMB Score | 80.7 | 10 | 3mo ago |
| MMStar (test) | ETTC | Accuracy | 63.73 | 7 | 2d ago |
| MMBench CN v1.1 | Qwen3-Omni | Accuracy | 87.7 | 5 | 1mo ago |
| MMBench EN v1.1 | MiniCPM-o 4.5 | Accuracy | 89 | 5 | 1mo ago |
| Winoground | CROCScore | Text Accuracy | 61.5 | 5 | 1mo ago |
| MMBench (test) | Baseline | Overall Accuracy | 78.8 | 4 | 1d ago |
| Quantiphy | Phi-4-Multimodal | Average MRA | 27.4 | 4 | 22d ago |
| MMBench (dev) | Prompt Highlighter | Accuracy | 69.7 | 4 | 3mo ago |
| MME Perception | Prompt Highlighter | MME Score | 1,552.5 | 4 | 3mo ago |
| Vision-Language Evaluation Suite (ChartQA, DocVQA, AI2D, VQA, AndroidControl, CountBenchQA) | Our Method | ChartQA Accuracy | 68.1 | 2 | 2mo ago |
| Vision-Language Benchmarks Hard Partition | VISOR | ChartQA Score | 78.1 | 2 | 2mo ago |
| Vision-Language Benchmarks Easy Partition | Qwen2-VL-2B | RealWorldQA Accuracy | 61.1 | 2 | 2mo ago |