| Dataset Name | SOTA Method | Metric | Value | Trend | Updated |
|---|---|---|---|---|---|
| LongVideoBench (val) | LVAgent | Accuracy | 80 | 139 | 3d ago |
| LongVideoBench | | Score | 66.7 | 110 | 2d ago |
| MLVU | Qwen3-VL-235B-A22B | Accuracy | 83.8 | 72 | 3d ago |
| LVBench | AVP | Accuracy | 74.8 | 63 | 3d ago |
| Video-MME long 1.0 | Gemini 1.5 Pro | Accuracy (No Subs) | 67.4 | 45 | 3d ago |
| MLVU (test) | VideoZoomer | Average Score | 55.8 | 41 | 3d ago |
| Video-MME Overall | | Accuracy | 87 | 39 | 3d ago |
| Video-MME Long | AVP | Accuracy | 81.9 | 37 | 3d ago |
| MLVU (dev) | Qwen2.5-VL+AdaRETAKE | Score | 78.1 | 31 | 3d ago |
| MLVU 3-120 min | | Accuracy | 82.1 | 23 | 3d ago |
| MLVU multiple-choice task | LLaVA-Video + TV-RAG | Overall Accuracy | 73.4 | 21 | 3d ago |
| VNBench | | Retrieval E Accuracy | 91.33 | 21 | 3d ago |
| VideoMME | | Accuracy | 81.3 | 21 | 3d ago |
| LongVideoBench 23sec-60 min | | Accuracy | 74.4 | 19 | 3d ago |
| LongVideoBench (LongVB) | | Accuracy | 69.2 | 17 | 3d ago |
| Video-MME (full) | Video-TwG | Overall Performance | 59.7 | 16 | 3d ago |
| VideoMME Long split, 30-60 min | | Accuracy | 65.3 | 15 | 3d ago |
| LVBench (val) | | Score | 58.7 | 15 | 3d ago |
| LVBench 30-90 min | | Accuracy | 69.2 | 13 | 3d ago |
| Video-MME w/o sub (full) | VideoARM | Score (Long) | 81.2 | 13 | 3d ago |
| LV-Bench (test) | MovieChat | ER | 21.3 | 13 | 3d ago |
| MINERVA | AVP | Accuracy | 65.6 | 9 | 3d ago |
| LongTimeScope 300min+ | VideoMem* | Accuracy | 45.1 | 8 | 3d ago |
| LongVideoBench Long | AVP | Accuracy | 70 | 7 | 3d ago |
| VNBench (test) | Video-XL+TiFRe | Retrieval Score | 0.827 | 7 | 3d ago |