| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Visual Perception | MMVP | Accuracy: 76.67 | 82 |
| Multimodal Visual Perception | MMVP | Accuracy: 85.33 | 72 |
| Visual Question Answering | MMVP | Accuracy: 79.3 | 36 |
| Vision Understanding | MMVP | Accuracy: 86.33 | 33 |
| Visual Reasoning | MMVP | Accuracy: 86.3 | 32 |
| Detail Perception | MMVP VLM | Orientation and Direction Accuracy: 26.7 | 27 |
| Multimodal Visual Pattern Understanding | MMVP | Accuracy: 80.33 | 25 |
| Vision-centric Reasoning | MMVP | Accuracy: 86.3 | 21 |
| Visual Pattern Recognition | MMVP | Accuracy: 78.7 | 19 |
| Spatial Understanding | MMVP | Accuracy: 77 | 15 |
| Fine-Grained Perception | MMVP | Accuracy: 74.67 | 14 |
| Hallucination | MMVP | Accuracy: 72.1 | 13 |
| Visual Question Answering | MMVP | Sentence Faithfulness (Insertion): 0.8052 | 12 |
| Vision-Centric Evaluation | MMVP | Score: 65.2 | 12 |
| Multimodal Visual Pattern Understanding | MMVP-VLM (test) | Orientation & Direction Acc: 0.267 | 12 |
| Fine-grained Perception | MMVP (test) | MMVP Score: 75.33 | 11 |
| Perception | MMVP (test) | Accuracy: 68.7 | 11 |
| Fine-grained Visual Pattern Recognition | MMVP-VLM | Orientation Score: 60 | 11 |
| Multimodal Multi-choice | MMVP | Accuracy: 75.3 | 10 |
| Visual Question Answering | MMVP-VLM | Orientation & Direction Score: 26.7 | 10 |
| Multimodal Visual Pattern Recognition | MMVP | MMVP Score: 75.3 | 9 |
| Visual-centric Reasoning | MMVP | Average Score: 28.9 | 9 |
| General Reasoning & Understanding | MMVP | Accuracy: 47.3 | 8 |
| Visual Perception & Contextual Understanding | MMVP-VLM | Average Score: 25.9 | 7 |
| Multimodal Reasoning | MMVP (test) | UPR: 0.118 | 6 |