| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Text-to-Video Retrieval | ActivityNet Captions | R@1 73 | 56 |
| Video Paragraph Captioning | ActivityNet Captions ae (val) | METEOR 18.16 | 43 |
| Paragraph-to-Video Retrieval | ActivityNet Captions (test) | R@1 54.8 | 22 |
| Video Captioning | ActivityNet Captions (val) | METEOR 20.04 | 22 |
| Video Grounding | ActivityNet Captions 1.3 (test val) | R@1 (IoU=0.5) 37.01 | 21 |
| Event Proposal Generation | ActivityNet Captions (val) | Recall Avg 86.33 | 13 |
| Video Paragraph Grounding | ActivityNet-Captions (test) | R@0.3 0.8189 | 12 |
| Video Captioning | ActivityNet-Captions ae MART (test) | BLEU@3 17.43 | 9 |
| Video-to-Text Retrieval | ActivityNet Captions 5K (val) | R@1 42.4 | 6 |
| Event Captioning | ActivityNet Captions v1.3 (test) | CIDEr 33.01 | 5 |
| Video Grounding | ActivityNet Captions (val 1) | R@1 (IoU=0.5) 42.49 | 5 |
| Dense Video Captioning | ActivityNet Captions no missings 1.0 (val) | BLEU@3 5.83 | 2 |