Long Video Understanding on VideoMME (test)
[Chart: Accuracy over time. Highlighted point: NVILA-8B, 64.2, May 21, 2026]
Updated 12d ago
Evaluation Results
Method                            Configuration              Date     Accuracy
--------------------------------  -------------------------  -------  --------
NVILA-8B                          Backbone=8B                2026.05  64.2
LLaVA-Video-7B + ST-GridPool      Input frames=64, Pooli...  2026.05  64.2
LLaVA-Video-7B                    Input frames=64            2026.05  63.3
Apollo-7B                         Backbone=7B                2026.05  61.3
LLaVA-OneVision-7B + ST-GridPool  Input frames=32, Pooli...  2026.05  59.0
Oryx-1.5-7B                       Backbone=1.5B              2026.05  58.8
LLaVA-OneVision-7B                Input frames=32            2026.05  58.2
IXC-2.5-7B                        Backbone=7B                2026.05  55.8
VideoLLaMA2.1-7B                  Backbone=7B                2026.05  54.9
mPLUG-Owl3-8B                     Backbone=8B                2026.05  53.5
LongVA-7B                         Backbone=7B                2026.05  52.6