LLaVA-Video-7B† + Ours (Query-Conditioned Evidential Keyframe Sampling)
| Method | Links | | | | | | | |
|---|---|---|---|---|---|---|---|---|
| | 49.4 | 52.4 | 45.9 | 54.6 | 39 | 46.8 | 32.8 | |
| 2026.04 | 48.5 | - | - | - | - | - | - | |
| 2026.04 | 47.9 | - | - | - | - | - | - | |
| 2026.04 | 47.8 | - | - | - | - | - | - | |
| | 47.7 | 49.8 | 44.5 | 57 | 37.3 | 41.8 | 34.5 | |
| 2026.04 | 47.6 | - | - | - | - | - | - | |
| | 46.6 | 47.9 | 44.5 | 55 | 42.7 | 43.8 | 25.9 | |
| 2026.04 | 46.4 | - | - | - | - | - | - | |
| 2026.04 | 45.3 | - | - | - | - | - | - | |
| 2026.04 | 45.3 | - | - | - | - | - | - | |
| 2026.04 | 45.3 | - | - | - | - | - | - | |
| 2026.04 | 45.2 | - | - | - | - | - | - | |
| 2026.04 | 45 | - | - | - | - | - | - | |
| 2026.04 | 43.4 | - | - | - | - | - | - | |
| 2026.04 | 43.3 | - | - | - | - | - | - | |
| 2026.04 | 43.3 | - | - | - | - | - | - | |
| 2026.04 | 43.3 | - | - | - | - | - | - | |
| 2026.04 | 43.1 | - | - | - | - | - | - | |
| 2026.04 | 43.1 | - | - | - | - | - | - | |
| 2026.04 | 42.9 | - | - | - | - | - | - | |
| 2026.04 | 42.8 | - | - | - | - | - | - | |
| 2026.04 | 42.4 | - | - | - | - | - | - | |
| 2026.04 | 42.3 | - | - | - | - | - | - | |
| 2026.04 | 41.7 | 41.5 | 40.2 | 42.3 | 33.2 | 49.8 | 29.3 | |
| 2026.04 | 41.3 | 43.7 | 40.7 | 37.8 | 38 | 46.2 | 27.3 | |
| 2026.04 | 41.3 | 42.8 | 39.1 | 34.9 | 38.7 | 38.2 | 48.8 | |
| 2026.04 | 39.4 | - | - | - | - | - | - | |
| 2026.04 | 39.1 | 38.7 | 39 | 36.8 | 37.3 | 39.8 | 29.3 | |
| 2026.04 | 38.4 | - | - | - | - | - | - | |
| 2026.04 | 37.6 | 36.8 | 38.6 | 40.6 | 32.7 | 37.3 | 29.3 | |
| 2026.04 | 37.4 | - | - | - | - | - | - | |
| | 36.6 | - | - | - | - | - | - | |
| 2026.04 | 29.3 | 28 | 30.3 | 28 | 29.3 | 28 | 36.4 | |
| 2026.04 | 28.8 | 30.3 | 25.1 | 26.5 | 27.7 | 31.9 | 25.5 | |
| 2026.04 | 22.5 | 21.3 | 23.1 | 25.9 | 22.3 | 24 | 17.2 | |