| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Open-Ended VideoQA | ANet-QA | Accuracy: 64.4 | 34 |
| T-to-V Retrieval | ANET | Recall@1: 70.5 | 21 |
| Question Answering | ANET | Accuracy: 50.4 | 3 |
| Video-Audio Question Answering | ANET | Accuracy: 51 | 2 |
| Text-to-Video-Audio Retrieval | ANET | R@1: 68.5 | 2 |