WebVid

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Composed Video Retrieval | WebVid-CoVR (test) | R@1: 5,982 | 45 |
| Video Reconstruction | WebVid 10M | PSNR: 35.76 | 34 |
| Video Reconstruction | Webvid (val) | PSNR: 34.75 | 16 |
| Video Captioning Evaluation | WebVid 10M | CLIP Score: 61.377 | 12 |
| Video Annotation | WebVid-10M | Avg Length: 214.49 | 12 |
| Video reconstruction | WebVid-10M (val) | VCPR: 492.5 | 10 |
| Video Generation | WebVid mini (val) | FVD @ 1 Frame: 526 | 10 |
| Video Generation | WebVid (test) | LPIPS: 0.135 | 7 |
| Camera-controlled video generation | WebVid | RotErr: 3.162 | 5 |
| Video Generation | webvid (test) | SubC: 92.8 | 5 |
| Semantic Consistency | WebVid10M | CLIP-F: 0.93 | 5 |
| Image-to-Video Generation | WebVid (test) | FID: 29.94 | 4 |
| Text-to-Video Generation | WebVid (test) | FID: 61.52 | 4 |
| Video Editing | WebVid | PSNR: 33.07 | 4 |
| Image-to-Video generation | WebVid 10M | Temporal Coherence: 96.9 | 4 |
| Text-to-Video Generation | WebVid-10M 2-million | CLIP Score: 48.3 | 4 |
| Text-to-Video Generation | WebVid-10M (val) | FVD: 292.35 | 4 |
| Image-to-Video generation with fine-grained motion control | WebVid-10M 1k (val) | FVD: 59.88 | 3 |
| Multi-subject and motion customization | WebVid subject pairs (test) | CLIP Text Alignment Score: 0.662 | 3 |
| Geometry Consistency | WebVid10M | Rot. AUC @ 5°: 25.2 | 3 |
| Transition Video Generation | Webvid10M (test) | LPIPS (First Frame): 0.3794 | 3 |
| Image-to-Video Generation | WebVid-10M (val) | F-Consistency (4): 95.36 | 3 |
| Text-to-Video Generation | WebVid10M | FID: 7.64 | 3 |
| Text-guided Video Editing | WebVid-10M (val) | Frame Consistency (CLIP Score): 94.9 | 2 |