| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Video Question Answering | MSVD-QA | Accuracy: 80.3 | 340 |
| Video Question Answering | MSVD-QA (test) | Accuracy: 87.8 | 274 |
| Video Question Answering | MSVD-QA zero-shot (test) | Accuracy: 81.5 | 56 |
| Video Question Answering | MSVD-QA LLaVA-Hound out-of-domain (test) | Accuracy: 67.2 | 11 |
| Video Question Answering | MSVD-QA random 1000 videos subset (test) | Accuracy: 39.3 | 2 |