VSTAR

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| General Reasoning & Understanding | VStar | Accuracy 82.1 | 21 |
| Fine-grained Visual Perception | VStar | VStar Score 82.72 | 18 |
| Visual Search and Reasoning | VStar | Score 76.96 | 18 |
| Multimodal Perception | VStar | Accuracy 92.67 | 18 |
| Visual Perception | VStar (test) | Accuracy 92.7 | 15 |
| Visual Understanding | VStar | Accuracy 85.86 | 11 |
| Perceptual Robustness | VStar | Overall Accuracy 80.25 | 9 |
| Video-grounded Dialogue Generation | VSTAR (test) | BLEU-1 0.092 | 9 |
| Dialogue Topic Segmentation | VSTAR | WinDif 0.765 | 7 |
| Dialogue Scene Segmentation | VSTAR (test) | mIoU 53.6 | 7 |