MMBench

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Multimodal Understanding | MMBench | Accuracy 88.8 | 367 |
| Multimodal Model Evaluation | MMBench | Accuracy 87.8 | 180 |
| Multimodal Understanding | MMBench CN | Accuracy 88.5 | 162 |
| Multimodal Model Evaluation | MMBench Chinese | Accuracy 82.6 | 121 |
| Multimodal Evaluation | MMBench | MMB Score 79.7 | 118 |
| Vision Understanding | MMBench | Accuracy 85 | 104 |
| Multimodal Benchmarking | MMBench-CN | Score 92.39 | 73 |
| Multimodal Benchmark | MMBench (MMB) | Accuracy 81.8 | 70 |
| Multimodal Understanding | MMBench Chinese | MMB Benchmark (CN) 89.5 | 70 |
| Multimodal Understanding | MMBench (MMB) | Accuracy 86.3 | 69 |
| Multimodal Understanding | MMBench (test) | Overall Score 81.2 | 65 |
| Multimodal Benchmarking | MMBench | Score 83.4 | 62 |
| Multimodal Benchmarking | MMBench English | Accuracy 70.4 | 61 |
| Multimodal Understanding | MMBench (dev) | Accuracy 80.41 | 58 |
| Multimodal Evaluation | MMBench CN | Accuracy 74.3 | 57 |
| Multimodal Understanding | MMBench English | MMB Score 90.8 | 55 |
| Multimodal Reasoning | MMBench | Accuracy 87 | 50 |
| Multimodal Understanding (Chinese) | MMBench Chinese | Accuracy 91 | 47 |
| Multimodal Reasoning | MMBench (dev) | Accuracy 87.6 | 47 |
| GUI Grounding | MMBench-GUI L2 (test) | Error (Windows, Basic) 1.5 | 46 |
| Multi-modal Benchmark | MMBench | Accuracy 83.3 | 40 |
| Visual Question Answering | MMBench-CN | Accuracy 93.13 | 40 |
| Multi-modal Understanding | MMBench (dev) | Overall Score 80.6 | 40 |
| Multi-modal Understanding | MMBench EN | Overall Score 86.3 | 39 |
| Multimodal Understanding | MMBench en (dev) | Score 84.2 | 38 |

Showing 25 of 116 rows