
Semantic Bottleneck Scene Generation

About

Coupling the high-fidelity generation capabilities of label-conditional image synthesis methods with the flexibility of unconditional generative models, we propose a semantic bottleneck GAN model for unconditional synthesis of complex scenes. We assume pixel-wise segmentation labels are available during training and use them to learn the scene structure. During inference, our model first synthesizes a realistic segmentation layout from scratch, then synthesizes a realistic scene conditioned on that layout. For the former, we use an unconditional progressive segmentation generation network that captures the distribution of realistic semantic scene layouts. For the latter, we use a conditional segmentation-to-image synthesis network that captures the distribution of photo-realistic images conditioned on the semantic layout. When trained end-to-end, the resulting model outperforms state-of-the-art generative models in unsupervised image synthesis on two challenging domains in terms of the Fréchet Inception Distance and user-study evaluations. Moreover, we demonstrate that the generated segmentation maps can be used as additional training data to strongly improve recent segmentation-to-image synthesis networks.
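The two-stage inference pipeline described above (latent vector → segmentation layout → rendered image) can be sketched as follows. This is a minimal toy stand-in, not the paper's architecture: the linear "networks" `generate_layout` and `render_image`, and all shapes and names, are hypothetical placeholders for the progressive segmentation generator and the conditional segmentation-to-image network.

```python
import numpy as np

rng = np.random.default_rng(0)
NUM_CLASSES, H, W, Z_DIM = 5, 8, 8, 16  # toy sizes, chosen for illustration

# Stage 1 (hypothetical stand-in for the unconditional segmentation generator):
# map a latent vector to per-pixel class logits, then argmax to a label map.
W1 = rng.standard_normal((Z_DIM, NUM_CLASSES * H * W)) * 0.1

def generate_layout(z):
    logits = (z @ W1).reshape(NUM_CLASSES, H, W)
    return logits.argmax(axis=0)          # (H, W) integer semantic layout

# Stage 2 (hypothetical stand-in for the segmentation-to-image network):
# render an RGB image conditioned on the one-hot encoded layout.
W2 = rng.standard_normal((NUM_CLASSES, 3)) * 0.1

def render_image(layout):
    onehot = np.eye(NUM_CLASSES)[layout]  # (H, W, NUM_CLASSES) one-hot
    return np.tanh(onehot @ W2)           # (H, W, 3), values in [-1, 1]

# Unconditional sampling: only a latent vector goes in; the semantic
# layout acts as the "bottleneck" between the two stages.
z = rng.standard_normal(Z_DIM)
layout = generate_layout(z)
image = render_image(layout)
```

The key design point the sketch illustrates is that all scene structure is committed to in the discrete layout before any pixels are rendered, which is what lets the generated layouts double as training data for segmentation-to-image models.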

Samaneh Azadi, Michael Tschannen, Eric Tzeng, Sylvain Gelly, Trevor Darrell, Mario Lucic • 2019

Related benchmarks

Task                            Dataset                 Result      Rank
Unconditional Image Generation  Bedrooms                FID 48.75   8
Unconditional Image Generation  FFHQ                    FID 24.52   8
Unconditional Image Generation  Cityscapes              FID 54.92   8
Unconditional Image Generation  CLEVR                   FID 35.13   8
Unconditional Image Generation  COCO                    FID 108.1   8
Unconditional Image Generation  COCOp                   FID 104.3   8
Unconditional image synthesis   Cityscapes-25K (test)   FID 54.92   6
Layout-to-Image Generation      CelebA                  FID 50.92   5
Unconditional image synthesis   ADE (Indoor)            FID 85.27   5
Layout-to-Image Generation      CLEVR                   FID 18.02   5

Showing 10 of 17 rows.

Other info

Code
