| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Video Question Answering | VideoEspresso | Accuracy 83.6 | 24 |
| Video Question Answering | VideoEspresso 1.0 | Narrative Accuracy 51.9 | 13 |
| Multi-Image Reasoning | VideoEspresso | Narration Score 50.6 | 12 |
| Video Reasoning | VideoEspresso extended | Narrative Accuracy 63.64 | 10 |
| Video Reasoning | VideoEspresso | Accuracy 58.7 | 6 |
| Video Question Answering | VideoEspresso (test) | Accuracy 0.278 | 4 |