| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Action Recognition | Something-Something v2 (val) | Top-1 Accuracy 79.8 | 535 |
| Action Recognition | Something-Something V2 | Top-1 Accuracy 77.5 | 341 |
| Action Recognition | Something-Something v2 (test) | Top-1 Acc 77 | 333 |
| Action Recognition | Something-Something v1 (val) | Top-1 Acc 67.3 | 257 |
| Action Recognition | Something-Something v1 (test) | Top-1 Accuracy 65.6 | 189 |
| Action Recognition | Something-Something v2 (test val) | Top-1 Accuracy 74.7 | 187 |
| Video Classification | Something-Something V2 (test) | Top-1 Acc 0.773 | 169 |
| Action Recognition | Something-Something V1 | Top-1 Acc 70 | 162 |
| Video Action Classification | Something-Something v2 | Top-1 Acc 86.1 | 139 |
| Video Classification | Something-something v1 (test) | Top-1 Accuracy 63.7 | 115 |
| Video Classification | Something-something v1 (val) | Top-1 Acc 54.7 | 75 |
| Video Classification | Something-Something v2 (val) | Top-1 Acc 73.3 | 69 |
| Video Classification | Something-Something v2 | Top-1 Acc 74.4 | 56 |
| Action Recognition | Something-Something V1 (test val) | Top-1 Acc 57.2 | 48 |
| Video Recognition | Something-Something V1 | Accuracy 52.5 | 27 |
| Few-shot Video Classification | Something-Something V2 (Small) | Accuracy 59.1 | 24 |
| Visual World Modelling | Something-Something | GPT-4o Score 7.37 | 18 |
| Action Recognition | Something-Something (val) | Top-1 Accuracy 51.6 | 18 |
| Video-to-Text Retrieval | Something-Something CiA-Retrieval v2 | R@1 (Chiral) 84 | 16 |
| Text-to-Video Retrieval | Something-Something CiA-Retrieval v2 | mAP (Chiral) 85.1 | 16 |
| Few-shot video recognition | Something-Something V2 | Top-1 Acc (K=2) 9.1 | 13 |
| Video Recognition | Something-Something V2 | Base Score 22.2 | 13 |
| Video Classification | Something-Something V1 | Top-1 Acc 61.3 | 13 |
| Action Recognition | Something-Something V2 | Base Score 16.6 | 13 |
| Action Classification | Something-Something V2 (val) | Top-1 Acc (multi-view fusion) 77 | 11 |