Vision-Language Evaluation Suite

Benchmarks

Task Name | Dataset Name | SOTA Result | Trend
Vision-Language Understanding | Vision-Language Evaluation Suite MMB, MMStar, MMMU, Hallusion, AI2D, OCR, SEED, SQA (test val) | MMB Score: 80.7 | 10
Vision-Language Understanding | Vision-Language Evaluation Suite (ChartQA, DocVQA, AI2D, VQA, AndroidControl, CountBenchQA) | ChartQA Accuracy: 68.1 | 2