| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Video Question Answering | MMVU | Accuracy 58.5 | 51 |
| Video Understanding | MMVU | Accuracy 65.8 | 25 |
| Multimodal Understanding | MMVU | Accuracy 75.4 | 18 |
| Video Reasoning | MMVU mc | Score 82.6 | 16 |
| Video Question Answering | MMVU (val) | Accuracy 75.4 | 15 |
| Video Multimodal Understanding | MMVU | Accuracy 68.6 | 15 |
| Video Question Answering | MMVU mc | Accuracy 75.4 | 9 |
| Video Understanding | MMVU (val) | Score 46 | 9 |
| Video Question Answering | MMVU (test) | Accuracy 75.96 | 7 |
| Expert-level Understanding | MMVU MC | Accuracy 75.4 | 7 |
| Multi-modal Video Understanding | MMVU | Score 67 | 6 |
| Video Question Answering | MMVU | M-Avg 67.2 | 5 |
| Video reasoning | MMVU | Accuracy 75.8 | 3 |