| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Video Question Answering | VideoMMMU | Accuracy 74.9 | 140 |
| Video Reasoning | VideoMMMU | Accuracy 68.33 | 89 |
| Video Understanding | VideoMMMU | Accuracy 68.64 | 59 |
| Video Multimodal Understanding | VideoMMMU | Accuracy 79.4 | 53 |
| Video Reasoning | VideoMMMU comprehension | Accuracy 61.33 | 27 |
| Video Reasoning | VideoMMMU adaptation | Accuracy 39 | 27 |
| Open World Video Understanding | VideoMMMU | Average Accuracy 71.2 | 19 |
| Multimodal Video Understanding | VideoMMMU | Overall Score 61.2 | 17 |
| Long Video Understanding and Reasoning | VideoMMMU | Adaptation Score 66 | 15 |
| Professional-level Knowledge Acquisition | VideoMMMU | Accuracy 83.6 | 13 |
| Long Video Reasoning | VideoMMMU | Overall Score 61.2 | 8 |
| Video Reasoning | VideoMMMU (test) | Accuracy 65.8 | 6 |
| Video Understanding | VideoMMMU (test) | Overall Score 61.2 | 5 |
| Knowledge Acquisition | VideoMMMU | Delta Knowledge 17.2 | 5 |
| Video Multi-modal Multi-task Understanding | VideoMMMU | Score 46.3 | 1 |