| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---:|---:|
| Video Question Answering | EgoSchema (Full) | Accuracy | 75 | 221 |
| Video Question Answering | EgoSchema | Accuracy | 82.2 | 161 |
| Video Understanding | EgoSchema | EgoSchema Score | 69.4 | 158 |
| Video Question Answering | EgoSchema subset | Accuracy | 81 | 114 |
| Video Question-Answering | EgoSchema (test) | Accuracy | 77.9 | 90 |
| Multiple Choice Video Question Answering | EgoSchema | Accuracy | 72.2 | 61 |
| Video Understanding | EgoSchema (test) | Accuracy | 77.9 | 55 |
| Video Question Answering | EgoSchema 500-question subset | Accuracy | 71.2 | 50 |
| Egocentric Video Understanding | EgoSchema | Score | 61.4 | 42 |
| Long-form Video Understanding | EgoSchema | Accuracy | 72.2 | 38 |
| Egocentric Video Understanding | EgoSchema (test) | Accuracy | 75.6 | 28 |
| Video Question Answering | EgoSchema 5031 videos (test) | Top-1 Accuracy | 62.4 | 26 |
| Multi-choice Video Question Answering | EgoSchema (test) | Accuracy | 72.2 | 26 |
| Long-form Egocentric Video Understanding | EgoSchema | Accuracy | 78.2 | 25 |
| Long-form Video Question Answering | EgoSchema | Accuracy | 77.9 | 24 |
| Video Question Answering | EgoSchema 3 min (test) | Accuracy | 66.2 | 18 |
| Multiple-Choice Video QA | EgoSchema latest (test) | Accuracy | 72.2 | 17 |
| Long Video Question Answering | EgoSchema (full set) | Accuracy | 55.6 | 17 |
| Long Video Understanding | EgoSchema (val) | Accuracy | 77.2 | 16 |
| Egocentric Video Question Answering | EgoSchema (public leaderboard) | Accuracy | 75 | 13 |
| Long-video understanding | EgoSchema (dev) | Accuracy | 63.7 | 11 |
| Video Question Answering | EgoSchema Zero-shot | Accuracy | 81.8 | 11 |
| Long Video Understanding | EgoSchema (test) | Accuracy | 72.2 | 10 |
| Multi-choice Video Question Answering | EgoSchema Subset 500 questions | Accuracy | 66.4 | 10 |
| Question Answering | EgoSchema | Accuracy | 58.4 | 9 |