VEnhancer: Generative Space-Time Enhancement for Video Generation

About

We present VEnhancer, a generative space-time enhancement framework that improves existing text-to-video results by adding detail in the spatial domain and synthesizing detailed motion in the temporal domain. Given a generated low-quality video, our approach increases its spatial and temporal resolution simultaneously, at arbitrary space and time up-sampling scales, through a unified video diffusion model. VEnhancer also effectively removes the spatial artifacts and temporal flickering of generated videos. To achieve this, we build on a pretrained video diffusion model: we train a video ControlNet and inject it into the diffusion model as a condition on low-frame-rate, low-resolution videos. To train this video ControlNet effectively, we design space-time data augmentation as well as video-aware conditioning. With these designs, VEnhancer is stable during training and can be trained in an elegant end-to-end manner. Extensive experiments show that VEnhancer surpasses existing state-of-the-art video super-resolution and space-time super-resolution methods in enhancing AI-generated videos. Moreover, with VEnhancer, the existing open-source state-of-the-art text-to-video method VideoCrafter-2 reaches first place on the video generation benchmark VBench.
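To make the conditioning setup more concrete, the sketch below shows one way a low-frame-rate, low-resolution condition could be derived from a high-quality training clip. It is an illustration only, not the authors' pipeline: the abstract does not specify the augmentation, so frame skipping, average pooling, the scale choices, and the helper names `downsample_frame`, `make_condition`, and `sample_training_pair` are all assumptions.

```python
import random


def downsample_frame(frame, s):
    """Average-pool a 2D frame (a list of rows) by an integer factor s."""
    h, w = len(frame), len(frame[0])
    out = []
    for y in range(0, h - h % s, s):
        row = []
        for x in range(0, w - w % s, s):
            block = [frame[y + dy][x + dx] for dy in range(s) for dx in range(s)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


def make_condition(video, s, t):
    """Keep every t-th frame, then shrink each kept frame by a factor of s."""
    return [downsample_frame(frame, s) for frame in video[::t]]


def sample_training_pair(video, space_scales=(1, 2, 4), time_scales=(1, 2, 4), rng=random):
    """Pick random space/time scales (hypothetical choices) and build a pair.

    Returns (condition, target, (s, t)); the target is the untouched clip.
    """
    s = rng.choice(space_scales)
    t = rng.choice(time_scales)
    return make_condition(video, s, t), video, (s, t)


# Toy clip: 8 frames of 4x4 pixels, where frame i is filled with the value i.
video = [[[float(i)] * 4 for _ in range(4)] for i in range(8)]
condition = make_condition(video, s=2, t=4)
```

Sampling the scales per clip, as in `sample_training_pair`, is one way to expose a single model to many up-sampling factors. Whether VEnhancer samples them this way is not stated in the abstract.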

Jingwen He, Tianfan Xue, Dongyang Liu, Xinqi Lin, Peng Gao, Dahua Lin, Yu Qiao, Wanli Ouyang, Ziwei Liu • 2024

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Video Super-Resolution | UDM10 (test) | PSNR | 21.64 | 51 |
| Video Super-Resolution | SPMCS (test) | Avg. PSNR | 19.272 | 36 |
| Video Restoration | REDS30 | PSNR | 22.4 | 17 |
| Video Super-Resolution | VideoGen30 (test) | Visual Quality | 2.686 | 10 |
| Video Restoration | REDS30 Spatio-Temporal Strong | PSNR | 19.75 | 10 |
| Video Restoration | REDS Spatio-Temporal Light 30 | PSNR | 19.92 | 10 |
| Video Restoration | YouHQ40 Spatial Downsampling | PSNR | 22.18 | 10 |
| Video Restoration | YouHQ40 Spatio-Temporal Downsampling | PSNR | 21.85 | 10 |
| Video Restoration | YouHQ40 Spatio-Temporal Strong | PSNR | 20.55 | 10 |
| Video Restoration | UDM10 (test) | PSNR | 25.308 | 10 |
Showing 10 of 25 rows
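
Most results above are reported as PSNR (peak signal-to-noise ratio, in dB; higher is better). As a reminder of what the metric measures, here is a minimal sketch of the standard definition, 10 · log10(MAX² / MSE), over flat sequences of pixel values. The function name `psnr` and the default peak value of 255 are assumptions for illustration. The evaluation protocols behind the listed numbers (color space, cropping, per-frame averaging) are not given here and may differ.

```python
import math


def psnr(reference, restored, max_value=255.0):
    """PSNR in dB between two equal-length sequences of pixel values."""
    if len(reference) != len(restored) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error between the reference and the restored signal.
    mse = sum((a - b) ** 2 for a, b in zip(reference, restored)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals
    return 10.0 * math.log10(max_value ** 2 / mse)


# A constant error of 25.5 on a 0..255 scale gives MSE = 650.25, i.e. 20 dB.
example = psnr([0.0] * 4, [25.5] * 4)
```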
