
Standard Multimodal Evaluation Suite

Benchmarks

Task Name | Dataset Name | SOTA Result | Trend
Multimodal Understanding | Standard Multimodal Evaluation Suite (GQA, MMBench, MME, TextVQA, ScienceQA, VQA v2) 1.5 (test val) | GQA Score: 63.2 | 32
Multimodal Question Answering and Understanding | Standard Multimodal Evaluation Suite GQA, MMB, MME, VQA-T, SQA-I, VQA-v2, POPE, MMMU, MM-Vet | GQA Accuracy: 61.9 | 26