| Dataset Name | SOTA Method | Metric | Score | Entries | Updated |
|---|---|---|---|---|---|
| Video-MME | | Accuracy | 87.8 | 82 | 1d ago |
| LVBench | | Accuracy | 53.8 | 34 | 22d ago |
| MVBench Overall | PyraTok | Accuracy | 86.03 | 25 | 1mo ago |
| PerceptionTest (PercTest) | PLM-7B | Accuracy | 82.7 | 21 | 1d ago |
| NExT-QA | | Accuracy | 86.3 | 21 | 1d ago |
| Combined (VideoMME, LVBench, LongVideoBench, EgoSchema, MLVU) | GPT-4o | Average Score | 64.9 | 11 | 22d ago |
| NExT-QA (test) | LLaVA-Video-7B + ST-GridPool | Accuracy | 83.8 | 8 | 12d ago |
| TempCompass (test) | NVILA-8B | Accuracy | 69.7 | 7 | 12d ago |
| PerceptionTest | TS-LLaVA + PAS | Overall Score | 69.4 | 6 | 2mo ago |
| TVBench | Seed1.8 | Accuracy | 71.5 | 6 | 2mo ago |
| Aggregate VideoMME, EgoSchema, LongVideoBench, MLVU, VideoMMMU | EchoPrune | Average Score | 55.6 | 5 | 22d ago |
| WildVideo | GPT-4o | Accuracy | 62.1 | 5 | 1mo ago |
| TOMATO | GPT-4o | Accuracy | 37.7 | 5 | 3mo ago |
| LongVideoBench | MECoT | Accuracy | 52.13 | 4 | 2mo ago |
| VideoMME Long | Qwen2.5-VL-7B-Instruct | Accuracy | 41.6 | 2 | 2mo ago |
| VidComposition | MECoT | Accuracy | 55.57 | 2 | 3mo ago |