| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| TVPrediction | MUGEN | FVD 57.2 | 22 |
| Text-to-Video Retrieval | MUGEN n=500 | Accuracy 86.9 | 4 |
| Text-to-Video Retrieval | MUGEN n=250 | Accuracy 85 | 4 |
| Video-to-Text Retrieval | MUGEN n=500 | Accuracy 86.6 | 4 |
| Video-to-Text Retrieval | MUGEN n=250 | Accuracy 83.7 | 4 |
| Text-conditional Video Generation | MUGEN | FVD 89.3 | 4 |
| Unconditional Video Generation | MUGEN TVC (test) | FVD 368.6 | 4 |
| Video Prediction | MUGEN | Video Quality Score (Q.) 2.03 | 3 |
| Video Captioning | MUGEN-GAME (unseen) | BLEU-4 7.8 | 1 |