| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Video Instance Segmentation | OVIS (val) | AP 57.1 | 301 |
| Temporal Grounding | OVIS (test) | R1@0.5 35.25 | 27 |
| Video Instance Segmentation | OVIS | mAP 49.9 | 23 |
| Video Instance Segmentation | OVIS 2021 (val) | AP 42.6 | 14 |
| Spatial Grounding | OVIS (test) | HOTA 22.73 | 12 |
| Video Instance Segmentation | OVIS (test) | AP 39.4 | 12 |
| Video Instance Segmentation | OVIS 1.0 (val) | AP 36.2 | 11 |
| Video Instance Segmentation | OVIS 56 (val) | AP 43.2 | 8 |
| Video Multimodal Interpretation | OVIS (val) | AP 28.7 | 7 |
| Video Semantic Segmentation | OVIS | mAP 18.6 | 6 |
| Video Object Detection | OVIS (val) | AP 53.2 | 6 |
| Video Instance Segmentation | OVIS Sub-Sparse (val) | AP 18.5 | 6 |
| Video Instance Segmentation | OVIS (test val) | AP 19.6 | 6 |
| Open-vocabulary video instance segmentation | OVIS (val) | mAP 60.5 | 5 |
| Video Captioning | OVIS | METEOR 21.4 | 5 |
| Controllable Video Segmentation | OVIS (test) | Jaccard & F-measure 74.6 | 3 |
| Object Segmentation | OVIS | J&F Score 86.8 | 2 |
| Segment Anything | OVIS | PAvPU 86.9 | 2 |