Our new X account is live! Follow @wizwand_team for updates
SOTA Vision-Language Reasoning benchmarks and papers with code | Wizwand
Vision-Language Reasoning
Benchmarks
| Dataset Name | SOTA Method | Metric | Score | Results | Last Updated |
|---|---|---|---|---|---|
| VL Reasoning Benchmarks | FRISM | MVista Score | 74 | 28 | 3d ago |
| VL Reasoning Benchmarks MathVista, MVerse, MathVision, MMMU, R1-OV, MMStar | FRISM | MathVista Acc | 79.8 | 25 | 3d ago |
| CODA-LM 1.0 (test) | Qwen2.5-VL-7B + Ours | Barrier | 79.8 | 13 | 3d ago |
| CVBench | Qwen3-VL-8B + GRPO | Accuracy | 86.16 | 12 | 3d ago |
| MMStar cleaned | Jigsaw + CARE | Score | 77.59 | 10 | 3d ago |
| Winoground | LLaVA-1.5 13B | Simple Acc | 59.88 | 9 | 3d ago |
| SugarCrepe (test) | Q4 system redistr (prop) | Simple Accuracy | 62.75 | 7 | 3d ago |
| NaturalBench (test) | Q4 system redistr (prop) | Simple Accuracy | 66.02 | 7 | 3d ago |
| MME (test) | No intervention baseline | Simple Accuracy | 78.98 | 7 | 3d ago |
| HallusionBench (test) | Q4 system redistr (prop) | Simple Accuracy | 53.31 | 7 | 3d ago |
| BEAF (test) | Q4 system redistr (prop) | Simple Accuracy | 88.4 | 7 | 3d ago |
| Winoground | CLIP | Text Score | 30.5 | 4 | 3d ago |
| CLEVR CoGenT (Split B) | NeSyCoCo | Accuracy | 78.8 | 3 | 3d ago |
| CLEVR-CoGenT (Split A) | MDETR | Accuracy | 99.7 | 3 | 3d ago |
| CLEVR-SYN hard (val) | NeSyCoCo | Accuracy | 73.4 | 2 | 3d ago |
| CLEVR-SYN medium (val) | NeSyCoCo | Accuracy | 81.2 | 2 | 3d ago |
| CLEVR-SYN easy (val) | NeSyCoCo | Accuracy | 92.1 | 2 | 3d ago |
| SelfEval Benchmark (test) | CLIP | Attribute Binding | 55.4 | 2 | 3d ago |
Showing 18 of 18 rows