NoisyTwins: Class-Consistent and Diverse Image Generation through StyleGANs

About

StyleGANs are at the forefront of controllable image generation as they produce a latent space that is semantically disentangled, making it suitable for image editing and manipulation. However, the performance of StyleGANs severely degrades when trained via class-conditioning on large-scale long-tailed datasets. We find that one reason for degradation is the collapse of latents for each class in the $\mathcal{W}$ latent space. With NoisyTwins, we first introduce an effective and inexpensive augmentation strategy for class embeddings, which then decorrelates the latents based on self-supervision in the $\mathcal{W}$ space. This decorrelation mitigates collapse, ensuring that our method preserves intra-class diversity with class-consistency in image generation. We show the effectiveness of our approach on large-scale real-world long-tailed datasets of ImageNet-LT and iNaturalist 2019, where our method outperforms other methods by $\sim 19\%$ on FID, establishing a new state-of-the-art.
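
The two steps named in the abstract, augmenting class embeddings and decorrelating latents in $\mathcal{W}$, can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes Gaussian noise as the augmentation and a Barlow-Twins-style cross-correlation objective as the self-supervised decorrelation; the function names, `sigma`, and `off_diag_weight` are hypothetical choices made for illustration only.

```python
import numpy as np


def noisy_twin_embeddings(class_emb, sigma, rng):
    """Return two noise-augmented copies ("twins") of each class embedding.

    Assumption: additive Gaussian noise with a fixed scale `sigma`.
    class_emb has shape (batch, emb_dim).
    """
    noise_a = rng.normal(0.0, sigma, size=class_emb.shape)
    noise_b = rng.normal(0.0, sigma, size=class_emb.shape)
    return class_emb + noise_a, class_emb + noise_b


def decorrelation_loss(w_a, w_b, off_diag_weight=0.005, eps=1e-8):
    """Barlow-Twins-style loss on two batches of latents, each (batch, dim).

    Assumption: this stands in for the self-supervised decorrelation
    described in the abstract; the weight value is illustrative.
    """
    # Standardize every latent dimension across the batch.
    z_a = (w_a - w_a.mean(axis=0)) / (w_a.std(axis=0) + eps)
    z_b = (w_b - w_b.mean(axis=0)) / (w_b.std(axis=0) + eps)

    # Cross-correlation matrix between the two views, shape (dim, dim).
    batch = w_a.shape[0]
    corr = z_a.T @ z_b / batch

    # Diagonal pulled toward 1: the twins should agree per dimension.
    on_diag = np.sum((np.diag(corr) - 1.0) ** 2)
    # Off-diagonal pulled toward 0: dimensions should be decorrelated.
    off_diag = np.sum(corr ** 2) - np.sum(np.diag(corr) ** 2)
    return on_diag + off_diag_weight * off_diag


rng = np.random.default_rng(0)

# Collapsed latents: every sample of a class maps to the same point.
collapsed = np.ones((256, 8))
loss_collapsed = decorrelation_loss(collapsed, collapsed)

# Diverse latents: samples vary independently in every dimension.
diverse = rng.normal(size=(256, 8))
loss_diverse = decorrelation_loss(diverse, diverse)

print(loss_collapsed > loss_diverse)
```

In this toy setting, collapsed latents have zero variance across the batch, so the diagonal of the correlation matrix stays at 0 and the loss is large, while diverse latents score much lower. In a real model the two inputs would be the mapping-network outputs for the two noisy embeddings of the same class, not the raw arrays used here.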

Harsh Rangwani, Lavish Bansal, Kartik Sharma, Tejan Karmali, Varun Jampani, R. Venkatesh Babu • 2023

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Class-conditional Image Generation | ImageNet-LT (val) | FID 21.29 | 15 |
| Image Generation | AnimalFace | FID 16.15 | 11 |
| Image Generation | ImageNet Carnivore | FID 13.65 | 6 |
| Image Generation | CIFAR10-LT | FID 17.74 | 5 |
| Image Generation | iNaturalist 2019 (val) | FID 11.46 | 5 |
| Class-conditional Image Generation | iNaturalist 2019 (val) | FID 11.46 | 4 |
