
StarGAN v2: Diverse Image Synthesis for Multiple Domains

About

A good image-to-image translation model should learn a mapping between different visual domains while satisfying the following properties: 1) diversity of generated images and 2) scalability over multiple domains. Existing methods address either of the issues, having limited diversity or multiple models for all domains. We propose StarGAN v2, a single framework that tackles both and shows significantly improved results over the baselines. Experiments on CelebA-HQ and a new animal faces dataset (AFHQ) validate our superiority in terms of visual quality, diversity, and scalability. To better assess image-to-image translation models, we release AFHQ, high-quality animal faces with large inter- and intra-domain differences. The code, pretrained models, and dataset can be found at https://github.com/clovaai/stargan-v2.

Yunjey Choi, Youngjung Uh, Jaejun Yoo, Jung-Woo Ha • 2019

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Image-to-Image Translation | Retinal Fundus-to-Angiogram (test) | FID 26.7 | 42 |
| Image-to-Image Translation | CelebA-HQ | FID 32.16 | 28 |
| Unpaired Image-to-Image Translation | Cat → Dog v1 (test) | FID 54.88 | 14 |
| Reference-guided image synthesis | AFHQ (test) | FID 19.78 | 13 |
| Reference-guided image synthesis | CelebA-HQ (test) | FID 19.58 | 12 |
| Segmentation | Chest MRI to CT | Accuracy 90.7 | 10 |
| Segmentation | Retinal OCT | Accuracy 75.4 | 10 |
| Segmentation | Cardiac MRI | Accuracy 94.4 | 10 |
| latent-guided image synthesis | CelebA-HQ | FID 13.7 | 9 |
| Image Synthesis | Retinal OCT (test) | FID 174.2 | 9 |
Showing 10 of 50 rows
