| Task Name | Dataset Name | SOTA Result | Trend | |
|---|---|---|---|---|
| Multimodal Evaluation | MM-Bench | Accuracy 83 | 57 | |
| Multimodal Understanding | MM-Bench en (test) | Accuracy 83.9 | 27 | |
| Multimodal Understanding | MM-Bench cn (test) | Accuracy 79.2 | 19 | |
| Multimodal Benchmarking | MM-Bench 37 | Accuracy 71.5 | 19 | |
| Multimodal Understanding | MM-Bench | MBen Score 87.64 | 16 | |
| Multimodal Understanding | MM-Bench | Absolute Score 66.1 | 14 | |
| Multi-modal Understanding | MM-Bench-CN (MMBCN) (test) | MMBCN Score 84 | 13 | |
| Multi-modal Understanding | MM-Bench (MMB) (test) | MMB Score 86.3 | 13 | |
| Multimodal Understanding | MM-Bench (MMB) en (dev) | Accuracy 85 | 12 | |
| Visual Language Model Evaluation | MM-Bench CN | MMB (CN) Score 57.5 | 7 | |
| Visual Language Model Evaluation | MM-Bench EN | MM-Bench (EN) Score 65.8 | 7 | |