SOTA Vision-Language Reasoning benchmarks and papers with code | Wizwand
Vision-Language Reasoning
Benchmarks
| Dataset Name | SOTA Method | Metric | Value | Results | Last Updated |
|---|---|---|---|---|---|
| VL Reasoning Benchmarks | FRISM | MVista Score | 74 | 28 | 1mo ago |
| VL Reasoning Benchmarks MathVista, MVerse, MathVision, MMMU, R1-OV, MMStar | FRISM | MathVista Acc | 79.8 | 25 | 1mo ago |
| SQA3D ScanNet scenes (test) | DINOv3 + SpatialBoost | BLEU-1 | 54.9 | 13 | 25d ago |
| ScanQA ScanNet scenes (test) | DINOv3 + SpatialBoost | BLEU-1 | 43.3 | 13 | 25d ago |
| CODA-LM 1.0 (test) | Qwen2.5-VL-7B + Ours | Barrier | 79.8 | 13 | 1mo ago |
| CVBench | Qwen3-VL-8B + GRPO | Accuracy | 86.16 | 12 | 1mo ago |
| MMStar cleaned | Jigsaw + CARE | Score | 77.59 | 10 | 1mo ago |
| Winoground | LLaVA-1.5 13B | Simple Acc | 59.88 | 9 | 1mo ago |
| SugarCrepe (test) | Q4 system redistr (prop) | Simple Accuracy | 62.75 | 7 | 1mo ago |
| NaturalBench (test) | Q4 system redistr (prop) | Simple Accuracy | 66.02 | 7 | 1mo ago |
| MME (test) | No intervention baseline | Simple Accuracy | 78.98 | 7 | 1mo ago |
| HallusionBench (test) | Q4 system redistr (prop) | Simple Accuracy | 53.31 | 7 | 1mo ago |
| BEAF (test) | Q4 system redistr (prop) | Simple Accuracy | 88.4 | 7 | 1mo ago |
| SSRBench | SOLE-R1 | General Score (SSRBench) | 85.6 | 4 | 18d ago |
| nuScenes (reasoning) | Qwen1.5-0.5B | BERT F1 Score | 67 | 4 | 1mo ago |
| Winoground | CLIP | Text Score | 30.5 | 4 | 1mo ago |
| Vision-Language Reasoning Suite (MathVerse, MathVista, MathVision, MMMU-Pro, We-Math) (test) | PLM-HoneyBee-3B-GRPO | Average Accuracy | 46.2 | 3 | 1mo ago |
| CLEVR CoGenT (Split B) | NeSyCoCo | Accuracy | 78.8 | 3 | 1mo ago |
| CLEVR-CoGenT (Split A) | MDETR | Accuracy | 99.7 | 3 | 1mo ago |
| CLEVR-SYN hard (val) | NeSyCoCo | Accuracy | 73.4 | 2 | 1mo ago |
| CLEVR-SYN medium (val) | NeSyCoCo | Accuracy | 81.2 | 2 | 1mo ago |
| CLEVR-SYN easy (val) | NeSyCoCo | Accuracy | 92.1 | 2 | 1mo ago |
| SelfEval Benchmark (test) | CLIP | Attribute Binding | 55.4 | 2 | 1mo ago |
Showing 23 of 23 rows