
Vision-Language Tasks

Benchmarks

Task Name                            Dataset Name                          SOTA Result        Trend
Overall Vision-Language Performance  Vision-Language Tasks Aggregate       Targeted ASR 86.8  18
Image Captioning                     Vision-Language Tasks Captioning      Targeted ASR 68.9  18
Image Classification                 Vision-Language Tasks Classification  Targeted ASR 91    18
VQA (General)                        Vision-Language Tasks General VQA     Targeted ASR 98.4  18