| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| Video Understanding | VideoMME | Score (Overall) | 75 | 357 |
| Video Question Answering | VideoMME | Accuracy | 85.1 | 251 |
| Video Understanding | VideoMME | Overall Score | 100 | 222 |
| Long Video Understanding | VideoMME | Accuracy | 81.3 | 89 |
| Multi-modal Video Understanding | VideoMME | Accuracy | 84.3 | 64 |
| Video Question Answering | VideoMME (test) | Short Length Accuracy | 89.8 | 61 |
| Video Understanding | VideoMME | Accuracy (No Subtitles) | 65.1 | 60 |
| Video Question Answering | VideoMME Medium | Accuracy | 72.9 | 53 |
| Video Question-Answering | VideoMME wo sub | Accuracy | 88.6 | 51 |
| Video Question Answering | VideoMME 16 (test) | Medium Length Score | 70.11 | 45 |
| Video Understanding | VideoMME (test) | Overall Score | 86.9 | 45 |
| Video Question Answering | VideoMME | VQA Score (wo subs) | 75 | 45 |
| Multi-modal Video Evaluation | VideoMME | Score | 65.1 | 42 |
| Video Understanding | VideoMME | Accuracy | 71.9 | 30 |
| Video Understanding | VideoMME w/o sub | Score | 77 | 29 |
| Video Question Answering | VideoMME Overall | Accuracy | 73.6 | 29 |
| Offline Video Understanding | VideoMME v1 (test) | Accuracy | 77.2 | 27 |
| Long Video Understanding | VideoMME Long split, 30-60 min | Accuracy | 65.3 | 27 |
| Video Understanding | VideoMME Long | Score | 59 | 25 |
| General Multi-task Video Understanding | VideoMME w/o sub | Average Accuracy | 77.4 | 22 |
| Video Understanding | VideoMME | Accuracy (Base) | 65.6 | 22 |
| Long Video Understanding | VideoMME Overall (w/o sub.) 2025 | Accuracy | 75 | 21 |
| Video Understanding | VideoMME no subtitles (test) | Accuracy | 71.9 | 21 |
| Video Understanding | VideoMME v1.0 (test) | Score | 60.8 | 21 |
| Video Question Answering | VideoMME w/ sub | Score | 87.8 | 21 |