
MMBench

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Multimodal Understanding | MMBench | Accuracy: 90.6 | 847 |
| Multimodal Understanding | MMBench CN | Accuracy: 88.5 | 254 |
| Multimodal Model Evaluation | MMBench | Accuracy: 87.8 | 204 |
| Multimodal Understanding | MMBench (MMB) | Accuracy: 86.3 | 166 |
| Multimodal Model Evaluation | MMBench Chinese | Accuracy: 82.6 | 154 |
| Multimodal Benchmarking | MMBench-CN | Score: 92.39 | 151 |
| Vision Understanding | MMBench | Accuracy: 85 | 141 |
| Multimodal Reasoning | MMBench | Accuracy: 90.63 | 127 |
| Multimodal Reasoning | MMBench EN V1.1 | Accuracy: 80.68 | 125 |
| Multimodal Benchmarking | MMBench English | Accuracy: 70.4 | 125 |
| Multimodal Evaluation | MMBench CN | Accuracy: 82.37 | 120 |
| Multimodal Evaluation | MMBench | MMB Score: 79.7 | 118 |
| Multimodal Reasoning | MMBench CN | Accuracy: 82 | 113 |
| Multi-modal Understanding | MMBench EN | Accuracy: 93.53 | 105 |
| Multimodal Benchmark | MMBench (MMB) | Accuracy: 81.8 | 95 |
| Multimodal Benchmarking | MMBench | Accuracy: 84.4 | 90 |
| Visual Question Answering | MMBench (MMB) | Accuracy: 92.1 | 86 |
| Multimodal Understanding | MMBench Chinese | MMB Benchmark (CN): 89.5 | 86 |
| Multi-modal Question Answering | MMBench | Accuracy: 86.4 | 84 |
| Multimodal Understanding | MMBench English | Accuracy: 88.79 | 81 |
| Multimodal Benchmarking | MMBench | Score: 83.4 | 73 |
| Visual Question Answering | MMBench-CN | Accuracy: 93.13 | 72 |
| GUI Grounding | MMBench-GUI L2 (test) | Average Error: 2.9 | 67 |
| Multimodal Understanding | MMBench (test) | Accuracy: 84.2 | 67 |
| Vision-Language Understanding | MMBench | Accuracy: 88.7 | 64 |
Showing 25 of 198 rows
...