MMBench

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Multimodal Understanding | MMBench | Accuracy 90.6 | 637 |
| Multimodal Model Evaluation | MMBench | Accuracy 87.8 | 180 |
| Multimodal Understanding | MMBench CN | Accuracy 88.5 | 174 |
| Multimodal Model Evaluation | MMBench Chinese | Accuracy 82.6 | 154 |
| Multimodal Understanding | MMBench (MMB) | Accuracy 86.3 | 141 |
| Vision Understanding | MMBench | Accuracy 85 | 141 |
| Multimodal Benchmarking | MMBench-CN | Score 92.39 | 129 |
| Multimodal Benchmarking | MMBench English | Accuracy 70.4 | 125 |
| Multimodal Evaluation | MMBench | MMB Score 79.7 | 118 |
| Multimodal Evaluation | MMBench CN | Accuracy 74.3 | 83 |
| Multimodal Benchmark | MMBench (MMB) | Accuracy 81.8 | 81 |
| Multimodal Reasoning | MMBench | Overall Score 88.15 | 78 |
| Visual Question Answering | MMBench (MMB) | Accuracy 92.1 | 76 |
| Multimodal Understanding | MMBench Chinese | MMB Benchmark (CN) 89.5 | 70 |
| GUI Grounding | MMBench-GUI L2 (test) | Average Error 2.9 | 67 |
| Multimodal Understanding | MMBench (test) | Accuracy 84.2 | 67 |
| Multi-modal Understanding | MMBench EN | Accuracy 88.3 | 64 |
| Multimodal Understanding | MMBench EN v1.1 | Accuracy 89.5 | 63 |
| Multimodal Benchmarking | MMBench (MMB) | MMB Score 65.4 | 62 |
| Multimodal Benchmarking | MMBench | Score 83.4 | 62 |
| Visual Question Answering | MMBench-CN | Accuracy 93.13 | 62 |
| Multimodal Benchmarking | MMBench | Accuracy 84.4 | 58 |
| Multimodal Understanding | MMBench (dev) | Accuracy 80.41 | 58 |
| Multi-modal Question Answering | MMBench | Accuracy 86.4 | 55 |
| Multi-modal Understanding | MMBench EN | Overall Score 86.3 | 55 |

Showing 25 of 153 rows