| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Multimodal Understanding | MMStar | Accuracy 82 | 324 |
| Multimodal Reasoning | MMStar | Accuracy 82 | 143 |
| Multimodal Evaluation | MMStar | Accuracy 69.5 | 70 |
| Visual Question Answering | MMStar | Accuracy 82.96 | 63 |
| Image Understanding | MMStar | Score 65.1 | 54 |
| General Visual Question Answering | MMStar | Score 77.8 | 35 |
| General Reasoning | MMStar | Score 69.2 | 32 |
| Multimodal Understanding | MMStar | Average Score 68.01 | 31 |
| Perception | MMStar latest (test) | CP 67.2 | 30 |
| General Visual Reasoning | MMStar | Accuracy 77.5 | 29 |
| Multimodal Reasoning | MMStar | Accuracy 75.2 | 29 |
| Multi-modal Visual Capability | MMStar | Score 63.9 | 29 |
| Visual Reasoning | MMStar | Accuracy 68.2 | 27 |
| General image understanding | MMStar | Accuracy 62.33 | 23 |
| LVLM Evaluation | MMStar | CP Score 76.6 | 20 |
| Visual Perception | MMStar | Accuracy 65.7 | 20 |
| Multimodal Reasoning and Perception | MMStar (test) | Accuracy 63.9 | 19 |
| Mathematical Reasoning | MMStar Math | Accuracy 77.2 | 19 |
| Compositional Reasoning | MMStar | Accuracy 64.7 | 16 |
| Multimodal Reasoning | MMStar | Pass@1 Accuracy 67.1 | 16 |
| Vision-Language Perception and Reasoning | MMStar | Accuracy (MMStar) 39.9 | 16 |
| Multimodal Understanding | MMStar (test) | Score 63.9 | 16 |
| Visual Understanding | MMStar | Accuracy (Clean) 65.9 | 16 |
| Multimodal Perception | MMStar | Accuracy 83.6 | 16 |
| Multimodal Integration | MMStar | Accuracy 66.49 | 15 |