Semantic Image Synthesis with Spatially-Adaptive Normalization

About

We propose spatially-adaptive normalization, a simple but effective layer for synthesizing photorealistic images given an input semantic layout. Previous methods directly feed the semantic layout as input to the deep network, which is then processed through stacks of convolution, normalization, and nonlinearity layers. We show that this is suboptimal as the normalization layers tend to ``wash away'' semantic information. To address the issue, we propose using the input layout for modulating the activations in normalization layers through a spatially-adaptive, learned transformation. Experiments on several challenging datasets demonstrate the advantage of the proposed method over existing approaches, regarding both visual fidelity and alignment with input layouts. Finally, our model allows user control over both semantic and style. Code is available at https://github.com/NVlabs/SPADE .

Taesung Park, Ming-Yu Liu, Ting-Chun Wang, Jun-Yan Zhu• 2019

Related benchmarks

Task	Dataset	Result
Semantic segmentation	Cityscapes (test)	mIoU92.1	1252
Semantic Image Synthesis	ADE20K	FID18.9	66
Semantic Image Synthesis	Cityscapes	FID51.98	54
Object Detection	ARCADE real (test)	mAP@0.547.5	49
Semantic Image Synthesis	COCO Stuff	FID22.6	49
Semantic Image Synthesis	Cityscapes (test)	LPIPS0.546	48
Object Detection	Multi-center Internal real (test)	mAP@0.566.5	48
Semantic Image Synthesis	ADE20K (val)	FID33.9	47
Semantic Image Synthesis	COCO Stuff (val)	FID22.6	42
Semantic Image Synthesis	CelebAMask-HQ	FID29.2	33

Showing 10 of 100 rows

...

Other info

Code

Follow for update

@wizwand_team Discord