| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| Temporal Video Grounding | Charades-STA (test) | Recall@IoU=0.5 | 97 | 124 |
| Video Grounding | Charades-STA | R@1 IoU=0.5 | 75.3 | 113 |
| Temporal Grounding | Charades-STA | R@0.5 | 75 | 88 |
| Video Moment Retrieval | Charades STA (test) | Recall@1 (IoU=0.5) | 70.65 | 77 |
| Action Recognition | Charades (val) | mAP | 63.6 | 69 |
| Action Recognition | Charades | mAP | 0.6229 | 64 |
| Action Recognition | Charades (test) | mAP | 0.663 | 53 |
| Activity Detection | Charades localize v1 | mAP | 28.6 | 52 |
| Action Recognition | Charades v1 (test) | mAP | 45.2 | 52 |
| Temporal Video Grounding | Charades-STA | Rank-1 Recall (IoU=0.5) | 70.3 | 47 |
| Video Moment Retrieval | Charades-STA | R1@0.5 | 71.26 | 44 |
| Video Classification | Charades | mAP | 59.8 | 38 |
| Action Detection | Charades (test) | PAC | 30 | 27 |
| Video Temporal Grounding | Charades-TimeLens | R1@0.3 | 76.6 | 24 |
| Temporal Grounding | Charades-CON | Ground Score | 83.3 | 21 |
| Activity Detection | Charades (val) | mAP | 26.95 | 21 |
| Video Grounding | CharadesSTA | Accuracy (CharadesSTA) | 61.4 | 19 |
| Text-to-video Retrieval | Charades (test) | R@1 | 26.7 | 19 |
| Activity Detection | Charades (test) | mAP | 27.8 | 19 |
| Video Temporal Grounding | Charades-STA g43_s7 (test) | R@0.5 | 60.6 | 18 |
| Video Retrieval | Charades-STA (evaluation) | R@1 | 2.7 | 17 |
| Temporal Grounding | Charades | mIoU | 67.7 | 15 |
| Person Identification | Charades-AB (same-activity) | Rank 1 | 45.84 | 15 |
| Temporal Activity Detection | Charades v1_localize (val) | mAP | 28.79 | 15 |
| Multi-label Video Classification | Charades | mAP | 50.4 | 15 |