Image Synthesis via Semantic Composition
About
In this paper, we present a novel approach to synthesizing realistic images from their semantic layouts. We hypothesize that objects with similar appearance share similar representations. Our method establishes dependencies between regions according to their appearance correlation, yielding representations that are both spatially variant and mutually associated. Conditioned on these features, we propose a dynamically weighted network built from spatially conditional computation (in both convolution and normalization). Beyond preserving semantic distinctions, this dynamic network strengthens semantic relevance, benefiting both global structure and detail synthesis. Extensive experiments on benchmarks demonstrate qualitatively and quantitatively that our method achieves compelling generation performance.
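The spatially conditional computation described above derives per-location modulation from the semantic layout. As a rough illustration (not the authors' implementation; the function and parameter names here are hypothetical), the sketch below normalizes a feature map and then applies per-class scale and shift parameters gathered through the label map, so the modulation varies spatially with the semantics:

```python
import numpy as np

def spatially_conditional_norm(feat, layout, gamma, beta, eps=1e-5):
    """Normalize a feature map, then modulate it with per-class
    scale/shift parameters selected by the semantic layout.

    feat:   (C, H, W) feature map
    layout: (H, W) integer semantic label map
    gamma, beta: (num_classes, C) per-class modulation parameters
    """
    # instance-style normalization over the spatial dimensions
    mean = feat.mean(axis=(1, 2), keepdims=True)
    std = feat.std(axis=(1, 2), keepdims=True)
    normed = (feat - mean) / (std + eps)
    # gather spatially varying modulation from the layout:
    # gamma[layout] has shape (H, W, C) -> transpose to (C, H, W)
    g = gamma[layout].transpose(2, 0, 1)
    b = beta[layout].transpose(2, 0, 1)
    return normed * (1 + g) + b

# toy usage: two semantic classes, a 4-channel feature map
rng = np.random.default_rng(0)
feat = rng.standard_normal((4, 8, 8))
layout = np.zeros((8, 8), dtype=int)
layout[:, 4:] = 1  # right half belongs to class 1
gamma = rng.standard_normal((2, 4))
beta = rng.standard_normal((2, 4))
out = spatially_conditional_norm(feat, layout, gamma, beta)
print(out.shape)  # (4, 8, 8)
```

In a full generator these per-class parameters would be learned (and, per the paper, weights are further conditioned dynamically); the sketch only shows how a semantic layout can drive spatially variant modulation.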
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Semantic Image Synthesis | ADE20K | FID 29.3 | 66 |
| Semantic Image Synthesis | Cityscapes | FID 49.5 | 54 |
| Semantic Image Synthesis | ADE20K (val) | FID 29.3 | 47 |
| Semantic Image Synthesis | COCO-Stuff (val) | FID 18.1 | 42 |
| Semantic Image Synthesis | COCO-Stuff | FID 18.1 | 40 |
| Layout-to-Image Synthesis | COCO-Stuff (test) | -- | 25 |
| Semantic Image Synthesis | CelebAMask-HQ | FID 19.2 | 24 |
| Layout-to-Image Synthesis | ADE20K (test) | LPIPS 0.00e+0 | 7 |
| Layout-to-Image Synthesis | COCO-Stuff | FID 18.1 | 7 |
| Layout-to-Image Synthesis | ADE20K | FID 29.3 | 7 |