StarGAN v2: Diverse Image Synthesis for Multiple Domains
About
A good image-to-image translation model should learn a mapping between different visual domains while satisfying the following properties: 1) diversity of generated images and 2) scalability over multiple domains. Existing methods address only one of these issues, yielding either limited diversity or a separate model for every domain. We propose StarGAN v2, a single framework that tackles both and shows significantly improved results over the baselines. Experiments on CelebA-HQ and a new animal faces dataset (AFHQ) validate the superiority of our model in terms of visual quality, diversity, and scalability. To better assess image-to-image translation models, we release AFHQ, a dataset of high-quality animal faces with large inter- and intra-domain differences. The code, pretrained models, and dataset can be found at https://github.com/clovaai/stargan-v2.
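The two synthesis modes that appear in the benchmarks below, latent-guided and reference-guided, differ only in where the style code comes from: StarGAN v2's mapping network produces a style code from a random latent vector and a target domain, while its style encoder extracts one from a reference image. The sketch below illustrates this split with toy linear maps standing in for the real networks (all shapes and the `generate` combination rule are illustrative, not the paper's architecture):

```python
import numpy as np

rng = np.random.default_rng(0)

NUM_DOMAINS = 3   # e.g. cat, dog, wildlife in AFHQ
LATENT_DIM = 16
STYLE_DIM = 8
IMG_DIM = 64      # flattened toy "image"

# Toy stand-ins for StarGAN v2's networks: one output head per domain,
# so a single module covers all domains (scalability).
F_heads = rng.normal(size=(NUM_DOMAINS, STYLE_DIM, LATENT_DIM))  # mapping network F
E_heads = rng.normal(size=(NUM_DOMAINS, STYLE_DIM, IMG_DIM))     # style encoder E

def latent_style(z, domain):
    """Latent-guided synthesis: style code s = F(z, y)."""
    return F_heads[domain] @ z

def reference_style(ref_image, domain):
    """Reference-guided synthesis: style code s = E(x_ref, y)."""
    return E_heads[domain] @ ref_image

def generate(content, style):
    """Toy generator G(x, s): shifts the content by a style-dependent offset."""
    return content + style.sum() * np.ones_like(content)

content = rng.normal(size=IMG_DIM)
z1, z2 = rng.normal(size=LATENT_DIM), rng.normal(size=LATENT_DIM)

# Sampling different latent codes for the same input and domain yields
# different style codes, hence diverse outputs.
out1 = generate(content, latent_style(z1, domain=0))
out2 = generate(content, latent_style(z2, domain=0))

# A reference image drives the style instead when imitating a specific example.
ref = rng.normal(size=IMG_DIM)
out_ref = generate(content, reference_style(ref, domain=1))
```

The design point this mirrors is that one generator serves every domain; only the style-code source changes between the two evaluation protocols.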
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Image-to-Image Translation | Retinal Fundus-to-Angiogram (test) | FID 26.7 | 42 |
| Image-to-Image Translation | CelebA-HQ | FID 32.16 | 28 |
| Unpaired Image-to-Image Translation | Cat → Dog v1 (test) | FID 54.88 | 14 |
| Reference-guided image synthesis | AFHQ (test) | FID 19.78 | 13 |
| Reference-guided image synthesis | CelebA-HQ (test) | FID 19.58 | 12 |
| Segmentation | Chest MRI to CT | Accuracy 90.7 | 10 |
| Segmentation | Retinal OCT | Accuracy 75.4 | 10 |
| Segmentation | Cardiac MRI | Accuracy 94.4 | 10 |
| Latent-guided image synthesis | CelebA-HQ | FID 13.7 | 9 |
| Image Synthesis | Retinal OCT (test) | FID 174.2 | 9 |
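Most rows above report FID (Fréchet Inception Distance), where lower is better: it fits Gaussians to Inception-v3 features of real and generated images and measures the Fréchet distance between them. A minimal numpy sketch of the distance itself, given precomputed feature means and covariances (real evaluations extract these statistics from an Inception-v3 network):

```python
import numpy as np

def psd_sqrt(mat):
    """Matrix square root of a symmetric PSD matrix via eigendecomposition."""
    vals, vecs = np.linalg.eigh(mat)
    vals = np.clip(vals, 0.0, None)  # guard against tiny negative eigenvalues
    return (vecs * np.sqrt(vals)) @ vecs.T

def fid(mu1, cov1, mu2, cov2):
    """Frechet distance between N(mu1, cov1) and N(mu2, cov2):
    ||mu1 - mu2||^2 + Tr(cov1 + cov2 - 2 * sqrt(cov1^(1/2) cov2 cov1^(1/2)))."""
    s1 = psd_sqrt(cov1)
    covmean = psd_sqrt(s1 @ cov2 @ s1)
    diff = mu1 - mu2
    return float(diff @ diff + np.trace(cov1 + cov2 - 2.0 * covmean))

rng = np.random.default_rng(0)
mu = rng.normal(size=4)
a = rng.normal(size=(4, 4))
cov = a @ a.T + np.eye(4)  # symmetric positive definite

score_same = fid(mu, cov, mu, cov)            # identical stats -> ~0
score_shifted = fid(mu, cov, mu + 1.0, cov)   # shifted mean -> positive
```

Identical distributions score (numerically) zero, and any mean or covariance mismatch increases the score, which is why lower FID indicates generated images whose feature statistics better match the real data.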