Image benchmarks

Benchmarks

Task Name | Dataset Name | SOTA Result | Trend
Image Understanding | Image benchmarks Aggregate | Overall Score: 64.82 | 21
Zero-shot Image Understanding | Dynamic-resolution Image Benchmarks (GQA, POPE, ScienceQA, MME, MMBench) (test) | GQA Score: 60.5 | 13
Multimodal Understanding and Reasoning | Image Benchmarks (HallBench, MME, TextVQA, ChartQA, AI2D, RealWorldQA, CCBench, OCRVQA, SQA-IMG, POPE) | HallBench Score: 46.5 | 13