
MM-Bench

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Multimodal Evaluation | MM-Bench | Accuracy 83 | 57 |
| Multimodal Understanding | MM-Bench en (test) | Accuracy 83.9 | 27 |
| Multimodal Understanding | MM-Bench cn (test) | Accuracy 79.2 | 19 |
| Multimodal Benchmarking | MM-Bench 37 | Accuracy 71.5 | 19 |
| Multimodal Understanding | MM-Bench | MBen Score 87.64 | 16 |
| Multimodal Understanding | MM-Bench | Absolute Score 66.1 | 14 |
| Multi-modal Understanding | MM-Bench-CN (MMBCN) (test) | MMBCN Score 84 | 13 |
| Multi-modal Understanding | MM-Bench (MMB) (test) | MMB Score 86.3 | 13 |
| Multimodal Understanding | MM-Bench (MMB) en (dev) | Accuracy 85 | 12 |
| Visual Language Model Evaluation | MM-Bench CN | MMB (CN) Score 57.5 | 7 |
| Visual Language Model Evaluation | MM-Bench EN | MM-Bench (EN) Score 65.8 | 7 |