| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Violence Detection | Merged All-source (test) | Accuracy 93.5 | 3 |
| Sign Language Recognition | Merged-5565 | Top-1 Accuracy 53.34 | 2 |
| Vision-and-Language Navigation | Merged Unseen (test) | nDTW 33.6 | 2 |
| Vision-and-Language Navigation | Merged Unseen (dev) | nDTW 36.3 | 2 |
| Vision-and-Language Navigation | Merged Seen (test) | nDTW 57.4 | 2 |
| Vision-and-Language Navigation | Merged Seen (dev) | nDTW 58.6 | 2 |