
VSTAR

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Multimodal Perception | VStar | Accuracy 92.67 | 18 |
| Visual Perception | VStar (test) | Accuracy 92.7 | 15 |
| Video-grounded Dialogue Generation | VSTAR (test) | BLEU-1 0.092 | 9 |
| Dialogue Topic Segmentation | VSTAR | WinDif 0.765 | 7 |
| Dialogue Scene Segmentation | VSTAR (test) | mIoU 53.6 | 7 |