A Good Image Generator Is What You Need for High-Resolution Video Synthesis

About

Image and video synthesis are closely related areas aiming at generating content from noise. While rapid progress has been demonstrated in improving image-based models to handle large resolutions, high-quality renderings, and wide variations in image content, achieving comparable video generation results remains problematic. We present a framework that leverages contemporary image generators to render high-resolution videos. We frame the video synthesis problem as discovering a trajectory in the latent space of a pre-trained and fixed image generator. Not only does such a framework render high-resolution videos, but it also is an order of magnitude more computationally efficient. We introduce a motion generator that discovers the desired trajectory, in which content and motion are disentangled. With such a representation, our framework allows for a broad range of applications, including content and motion manipulation. Furthermore, we introduce a new task, which we call cross-domain video synthesis, in which the image and motion generators are trained on disjoint datasets belonging to different domains. This allows for generating moving objects for which the desired video data is not available. Extensive experiments on various datasets demonstrate the advantages of our methods over existing video generation techniques. Code will be released at https://github.com/snap-research/MoCoGAN-HD.
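The central idea in the abstract, a video as a trajectory through the latent space of a fixed image generator, can be illustrated with a toy sketch. This is not the authors' implementation: the frozen generator here is a stand-in random linear map, the motion generator is a simple random walk, and every name and size (`LATENT_DIM`, `FRAME_PIXELS`, `make_fixed_generator`, `motion_trajectory`, `synthesize_video`) is hypothetical.

```python
import math
import random

LATENT_DIM = 8     # hypothetical latent size
FRAME_PIXELS = 16  # hypothetical size of one flattened "frame"


def make_fixed_generator(seed=0):
    """Stand-in for a pre-trained, frozen image generator.

    Assumption: a fixed random linear map followed by tanh. A real system
    would load a trained image generator and keep its weights frozen.
    """
    rng = random.Random(seed)
    weights = [[rng.gauss(0, 1) for _ in range(LATENT_DIM)]
               for _ in range(FRAME_PIXELS)]

    def generate(z):
        # Map one latent code to one frame with values in [-1, 1].
        return [math.tanh(sum(w * x for w, x in zip(row, z)))
                for row in weights]

    return generate


def motion_trajectory(z0, num_frames, step=0.1, seed=1):
    """Toy motion generator (assumption: a random walk, not a learned model).

    Starts at the content code z0 and adds a small residual per frame,
    so the content code and the motion residuals are kept separate.
    """
    rng = random.Random(seed)
    trajectory = [list(z0)]
    for _ in range(num_frames - 1):
        prev = trajectory[-1]
        trajectory.append([x + step * rng.gauss(0, 1) for x in prev])
    return trajectory


def synthesize_video(generator, z0, num_frames):
    # Render every latent code on the trajectory with the same frozen generator.
    return [generator(z) for z in motion_trajectory(z0, num_frames)]


content_rng = random.Random(42)
z0 = [content_rng.gauss(0, 1) for _ in range(LATENT_DIM)]
generator = make_fixed_generator()
video = synthesize_video(generator, z0, num_frames=16)
print(len(video), len(video[0]))
```

Because only the trajectory changes from frame to frame, swapping `z0` changes the content while reusing the same `seed` in `motion_trajectory` reuses the same motion residuals, which is a rough analogue of the content and motion disentanglement the abstract describes.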

Yu Tian, Jian Ren, Menglei Chai, Kyle Olszewski, Xi Peng, Dimitris N. Metaxas, Sergey Tulyakov • 2021

Related benchmarks

Task | Dataset | Result | Rank
Video Generation | UCF-101 (test) | Inception Score 33.95 | 105
Video Generation | UCF101 | FVD 838 | 54
Video Generation | SkyTimelapse | FVD16 164.1 | 21
Class-conditioned Video Generation | UCF101 (test) | Fréchet Video Distance 700 | 19
Video Generation | UCF101 128x128 16 frames | Inception Score 32.36 | 17
Video Generation | SkyTimelapse (test) | FVD16 321.4 | 16
Video Generation | SkyTimelapse 256x256 (test) | FVD 164.1 | 14
Video Generation | TaiChi-HD 128x128 (test) | FVD 991 | 14
Long Video Generation | UCF-101 128-frame (test) | FVD 1.62e+3 | 13
Video Generation | UCF101 256x256 (test) | FVD 1.73e+3 | 13
Showing 10 of 39 rows
