General Video Understanding on NexT-QA (test)
[Chart: Accuracy over time. Highlighted result: LLaVA-Video-7B + ST-GridPool, 83.8 Accuracy. Date shown: May 21, 2026.]
Updated 12d ago
Evaluation Results
Method                              Settings                     Date      Accuracy
LLaVA-Video-7B + ST-GridPool        Input frames=64, Pooli...    2026.05   83.8
LLaVA-Video-7B                      Input frames=64              2026.05   83.2
NVILA-8B                            Backbone=8B                  2026.05   82.2
Oryx-1.5-7B                         Backbone=1.5B                2026.05   81.8
LLaVA-OneVision-7B + ST-GridPool    Input frames=32, Pooli...    2026.05   79.6
LLaVA-OneVision-7B                  Input frames=32              2026.05   79.4
mPLUG-Owl3-8B                       Backbone=8B                  2026.05   78.6
LongVA-7B                           Backbone=7B                  2026.05   68.3