| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Video Question Answering | TGIF | Top-1 Acc 79.1 | 33 |
| Video Question Answering | TGIF-Frame (test) | Accuracy 75.6 | 27 |
| Video Question Answering | TGIF-Frame | Accuracy 74.9 | 25 |
| Video Question Answering | TGIF Transition | Accuracy 0.991 | 18 |
| Text-based Video Retrieval | TGIF (test) | R@1 4.5 | 12 |
| Video Question Answering | TGIF Action | Accuracy 95.5 | 10 |
| Video Understanding | TGIF | Accuracy 47.6 | 6 |
| Open Ended Question Answering | TGIF | Accuracy 0.7222 | 6 |
| Video Question Answering | TGIF Transition (test) | Accuracy 99.1 | 6 |
| Video Question Answering | TGIF Action (test) | Accuracy 97.9 | 6 |