
Self-Attention Generative Adversarial Networks

About

In this paper, we propose the Self-Attention Generative Adversarial Network (SAGAN), which allows attention-driven, long-range dependency modeling for image generation tasks. Traditional convolutional GANs generate high-resolution details as a function of only spatially local points in lower-resolution feature maps. In SAGAN, details can be generated using cues from all feature locations. Moreover, the discriminator can check that highly detailed features in distant portions of the image are consistent with each other. Furthermore, recent work has shown that generator conditioning affects GAN performance. Leveraging this insight, we apply spectral normalization to the GAN generator and find that this improves training dynamics. The proposed SAGAN achieves state-of-the-art results, boosting the best published Inception score from 36.8 to 52.52 and reducing Fréchet Inception distance from 27.62 to 18.65 on the challenging ImageNet dataset. Visualization of the attention layers shows that the generator leverages neighborhoods that correspond to object shapes rather than local regions of fixed shape.
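The two mechanisms the abstract names can be sketched numerically. Below is a minimal NumPy illustration, not the paper's implementation: `self_attention` follows the SAGAN recipe of projecting a flattened feature map into queries, keys, and values, forming an N×N attention map over all spatial locations, and adding the attended output back through a learned scalar `gamma` (initialized to 0 in the paper, so the layer starts as an identity); `spectral_norm` shows the power-iteration estimate of the largest singular value used to normalize weight matrices. All names and shapes here are illustrative assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(x, Wf, Wg, Wh, gamma):
    """SAGAN-style self-attention over a flattened feature map (sketch).

    x: (C, N) features at N = H*W spatial locations.
    Wf, Wg: (C', C) query/key projections; Wh: (C, C) value projection.
    Returns gamma * attended + x, a learned residual.
    """
    f = Wf @ x                       # queries, (C', N)
    g = Wg @ x                       # keys,    (C', N)
    h = Wh @ x                       # values,  (C, N)
    # (N, N) attention map: row i weights every location j for output i,
    # so each output can draw on cues from all feature locations
    beta = softmax(f.T @ g, axis=1)
    o = h @ beta.T                   # attended values, (C, N)
    return gamma * o + x             # gamma = 0 at init -> identity

def spectral_norm(W, n_iter=50, seed=0):
    """Divide W by its largest singular value, estimated by power iteration."""
    u = np.random.default_rng(seed).standard_normal(W.shape[0])
    for _ in range(n_iter):
        v = W.T @ u
        v /= np.linalg.norm(v)
        u = W @ v
        u /= np.linalg.norm(u)
    sigma = u @ W @ v                # dominant singular value estimate
    return W / sigma
```

With `gamma = 0` the layer reduces to the identity, which is why SAGAN can insert it into a pretrained-style pipeline and let training gradually increase reliance on non-local evidence.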

Han Zhang, Ian Goodfellow, Dimitris Metaxas, Augustus Odena · 2018

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Image Generation | CIFAR-10 (test) | FID 13.4 | 471 |
| Image Generation | CIFAR-10 | – | 178 |
| Class-conditional Image Generation | ImageNet | FID 18.65 | 132 |
| Image Generation | LSUN church | FID 6.15 | 95 |
| Image Generation | LSUN bedroom | FID 14.06 | 56 |
| Image Generation | FFHQ | FID 16.21 | 52 |
| Image Generation | ImageNet 128x128 | FID 18.28 | 51 |
| Cross-modality synthesis (T2-weighted MRI to CT) | Pelvic MRI-CT dataset (test) | PSNR 27.61 | 42 |
| Image Generation | Tiny-ImageNet | – | 34 |
| Image Generation | CIFAR-10 (train) | FID 0.45 | 32 |

Showing 10 of 64 rows.
