Diverse Semantic Image Synthesis via Probability Distribution Modeling
About
Semantic image synthesis, which translates semantic layouts into photo-realistic images, is a one-to-many mapping problem. Although impressive progress has recently been made, diverse semantic synthesis that can efficiently produce semantic-level multimodal results remains a challenge. In this paper, we propose a novel diverse semantic image synthesis framework from the perspective of semantic class distributions, which naturally supports diverse generation at the semantic or even the instance level. We achieve this by modeling class-level conditional modulation parameters as continuous probability distributions instead of discrete values, and by sampling per-instance modulation parameters through instance-adaptive stochastic sampling that is consistent across the network. Moreover, we propose prior noise remapping, through linear perturbation parameters encoded from paired references, to facilitate supervised training and exemplar-based instance style control at test time. Extensive experiments on multiple datasets show that our method achieves superior diversity and comparable quality relative to state-of-the-art methods. Code will be available at https://github.com/tzt101/INADE.git
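The core idea in the abstract, class-level modulation parameters treated as distributions and sampled once per instance, can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names, the Gaussian form (mean plus standard deviation times noise), the `gamma * x + beta` modulation, and the toy parameter values are all assumptions made for illustration, and prior noise remapping is left out.

```python
import numpy as np


def instance_modulation_maps(class_map, instance_map, params, rng):
    """Build per-pixel (gamma, beta) maps by sampling noise once per instance.

    class_map, instance_map: integer arrays of shape (H, W).
    params: dict with 'mu_gamma', 'sigma_gamma', 'mu_beta', 'sigma_beta',
            each of shape (num_classes, C). These are hypothetical stand-ins
            for learned class-level distribution parameters.
    """
    h, w = class_map.shape
    c = params["mu_gamma"].shape[1]
    gamma = np.zeros((h, w, c))
    beta = np.zeros((h, w, c))
    for inst_id in np.unique(instance_map):
        mask = instance_map == inst_id
        cls = class_map[mask][0]  # assumption: each instance has one class
        # One noise draw per instance. Reusing the same draw in every layer
        # is what would keep an instance's style consistent across a network.
        noise_g = rng.standard_normal(c)
        noise_b = rng.standard_normal(c)
        gamma[mask] = params["mu_gamma"][cls] + params["sigma_gamma"][cls] * noise_g
        beta[mask] = params["mu_beta"][cls] + params["sigma_beta"][cls] * noise_b
    return gamma, beta


def modulate(features, gamma, beta, eps=1e-5):
    """Normalize features per channel, then apply the spatial scale and shift."""
    mean = features.mean(axis=(0, 1), keepdims=True)
    std = features.std(axis=(0, 1), keepdims=True)
    normalized = (features - mean) / (std + eps)
    return gamma * normalized + beta


rng = np.random.default_rng(0)
class_map = np.array([[0, 0, 1, 1],
                      [0, 0, 1, 1]])
instance_map = np.array([[0, 0, 1, 2],
                         [0, 0, 1, 2]])  # instances 1 and 2 share class 1
num_classes, channels = 2, 3
params = {
    "mu_gamma": np.ones((num_classes, channels)),
    "sigma_gamma": np.full((num_classes, channels), 0.1),
    "mu_beta": np.zeros((num_classes, channels)),
    "sigma_beta": np.full((num_classes, channels), 0.1),
}
gamma, beta = instance_modulation_maps(class_map, instance_map, params, rng)
features = rng.standard_normal((2, 4, channels))
out = modulate(features, gamma, beta)
```

In this toy layout, pixels inside one instance share the same sampled parameters, while the two instances of class 1 receive different samples from the same class distribution. Re-running with a different seed gives a different style per instance without changing the layout.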
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Semantic Image Synthesis | ADE20K | FID | 29.6 | 66 |
| Semantic Image Synthesis | Cityscapes | FID | 38.04 | 54 |
| Semantic Image Synthesis | CelebAMask-HQ | FID | 21.5 | 24 |
| Semantic Image Synthesis | ADE20K (test) | FID | 48.6 | 20 |
| Semantic Label to Face Generation | FFHQ | FID | 47.4 | 10 |
| Semantic to Face Generation | CelebA | FID | 54.27 | 10 |
| Semantic Image Synthesis | DeepFashion | Params (M) | 84.63 | 8 |
| Semantic Image Synthesis | iDesigner (test) | PSNR | 12 | 6 |
| Semantic Image Synthesis | CelebAMask-HQ (test) | PSNR | 12.24 | 6 |