Adversarial Video Generation on Complex Datasets

About

Generative models of natural images have progressed towards high fidelity samples by the strong leveraging of scale. We attempt to carry this success to the field of video modeling by showing that large Generative Adversarial Networks trained on the complex Kinetics-600 dataset are able to produce video samples of substantially higher complexity and fidelity than previous work. Our proposed model, Dual Video Discriminator GAN (DVD-GAN), scales to longer and higher resolution videos by leveraging a computationally efficient decomposition of its discriminator. We evaluate on the related tasks of video synthesis and video prediction, and achieve new state-of-the-art Fréchet Inception Distance for prediction for Kinetics-600, as well as state-of-the-art Inception Score for synthesis on the UCF-101 dataset, alongside establishing a strong baseline for synthesis on Kinetics-600.

Aidan Clark, Jeff Donahue, Karen Simonyan • 2019

Related benchmarks

Task                                Dataset              Result                 Rank
Video Generation                    UCF-101 (test)       Inception Score 32.97  105
Video Prediction                    BAIR (test)          FVD 109.8              59
Video Generation                    UCF101               --                     54
Video Prediction                    Kinetics-600 (test)  FVD 69.1               46
Video Prediction                    BAIR Robot Pushing   FVD 109.8              38
Video Prediction                    Bair                 FVD 109.8              34
Video Prediction                    BAIR Push (test)     FVD 109.8              30
Video Frame Prediction              Kinetics-600         gFVD 69.1              28
Class-conditioned Video Generation  UCF101 (test)        --                     19
Video Prediction                    Kinetics-600         FVD 69                 18
Showing 10 of 19 rows
