MSRVTT

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Video Question Answering | MSRVTT-QA | Accuracy 72.4 | 481 |
| Video Question Answering | MSRVTT-QA (test) | Accuracy 88.2 | 371 |
| Text-To-Video retrieval | MSRVTT (test) | Recall@1 18.2 | 155 |
| Video Captioning | MSRVTT | CIDEr 80.3 | 101 |
| Text-to-Video Retrieval | MSRVTT | R@1 63.9 | 98 |
| Text-to-video retrieval | MSRVTT | R@1 61 | 75 |
| Text-to-Video Retrieval | MSRVTT 1k (test) | Recall@10 87.4 | 63 |
| Video Captioning | MSRVTT (test) | CIDEr 80.5 | 61 |
| Video Captioning | MSRVTT | CIDEr 80.3 | 61 |
| Video Question Answering | MSRVTT-MC | Accuracy 97.7 | 61 |
| Text-to-Video Retrieval | MSRVTT | Recall@1 49.9 | 48 |
| Video Question Answering | MSRVTT | Accuracy 66.7 | 46 |
| Text-to-Video Retrieval | MSRVTT (1K-A) | R@1 49.3 | 42 |
| Video Generation | MSRVTT (val) | FVD 414 | 40 |
| Text-to-Video Retrieval | MSRVTT (UTD) | Recall@1 31.1 | 34 |
| Text-to-Video Retrieval | MSRVTT full (test val) | Recall@1 43.6 | 34 |
| Video Question Answering | MSRVTT-MC (test) | Accuracy 97.8 | 31 |
| Text-to-Video Retrieval | MSRVTT (MSR) zero-shot | R@1 42.6 | 26 |
| Video Question Answering | MSRVTT (test) | Accuracy 92.7 | 26 |
| Video-to-Text Retrieval | MSRVTT | R@1 50.1 | 24 |
| Text-to-Video Retrieval | MSRVTT 1K-A (test) | R@1 54.2 | 23 |
| Text-to-Video Retrieval | MSRVTT 1K 1.0 (test) | R@1 40.9 | 23 |
| Video captioning | MSRVTT (full) | CIDEr 75.9 | 20 |
| Image-to-Video Retrieval | MSRVTT I2V | Recall@1 92.4 | 18 |
| Video-Text Retrieval | MSRVTT | GFLOPS 44.7 | 18 |
Showing 25 of 56 rows