Multimodal Benchmarks

Benchmarks

Task Name	Dataset Name	SOTA Result
Multimodal Understanding and Question Answering	Multimodal Benchmarks MME, OCRBench, DocVQA, RealWorldQA, VLMBlind	MME Score: 2,386	33
Multimodal Question Answering	9 Multimodal Benchmarks (VQAv2, GQA, VizWiz, SQA-IMG, TextVQA, POPE, MME, MMB, MMB-CN) (test val)	VQAv2 Accuracy: 80	15
Multimodal Understanding	Multimodal Benchmarks Aggregate	Relative Performance: 101.7	13
Multimodal Understanding	Average of 10 Multimodal Benchmarks	Average Score: 72.15	9
Multimodal In-context Learning	Multimodal Benchmarks Average	Accuracy: 67.2	9
