DiDeMo

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Text-to-Video Retrieval | DiDeMo (test) | R@1: 70.5 | 376 |
| Text-to-Video Retrieval | DiDeMo | R@1: 32.4 | 360 |
| Video-to-Text Retrieval | DiDeMo | R@1: 71.9 | 108 |
| Video-to-Text Retrieval | DiDeMo (test) | R@1: 67.5 | 92 |
| Text-to-Video Retrieval | DiDeMo (DDM) full (test val) | Recall@1: 46.3 | 34 |
| Text-to-Video Retrieval | DiDeMo (DDM) zero-shot | R@1: 48.6 | 22 |
| Retrieval | DiDeMo T+A -> V | Recall@1: 82.1 | 20 |
| Video Retrieval | DiDeMo | R@1: 46.1 | 18 |
| Video-Text Retrieval | DIDEMO | GFLOPS: 44.5 | 18 |
| Text-to-video retrieval | DiDeMo (UTD-split) | Recall@1: 35.6 | 17 |
| Text-to-Video Retrieval | DiDeMo 1K videos (test) | R@1: 37 | 16 |
| Zero-shot Retrieval (T+V → A) | DiDeMo | Recall@1: 0.695 | 14 |
| Zero-shot Retrieval (T → A+V) | DiDeMo | Recall@1: 53.7 | 14 |
| Text-to-video retrieval | DiDeMo 28s (test) | R@1: 38.1 | 11 |
| Video Corpus Moment Retrieval (VCMR) | DiDeMo 14 (test) | Recall@1 (IoU=0.5): 2.26 | 11 |
| Text-to-Video Retrieval | DiDeMo 12 (full-corpus) | R@1: 26 | 8 |
| Text-to-Video Retrieval | DiDeMo 12 (test) | R@1: 45.3 | 8 |
| Text-to-Video Retrieval | DiDeMo (val) | R@1: 53.9 | 8 |
| Video Retrieval (clip-caption) | DiDeMo (test) | R@1: 20.4 | 7 |
| Video Retrieval | DiDeMo (test) | R@1: 60 | 7 |
| Text-to-Video Retrieval | DiDeMo 1 (val) | R@1: 49 | 6 |
| Text-to-Video Retrieval | DiDeMo CLIP-based (test) | R@1: 48.4 | 5 |
| Video-to-text retrieval | DiDeMo (full) | R@1: 46 | 5 |
| Video Grounding | DiDeMo (test) | R@1 (IoU=1.0): 25.57 | 4 |
| Video-to-Text Retrieval | DiDeMo CLIP-based (test) | R@1: 47.7 | 4 |
Showing 25 of 31 rows