The GAN is dead; long live the GAN! A Modern GAN Baseline
About
There is a widespread claim that GANs are difficult to train, and GAN architectures in the literature are littered with empirical tricks. We provide evidence against this claim and build a modern GAN baseline in a more principled manner. First, we derive a well-behaved regularized relativistic GAN loss that addresses issues of mode dropping and non-convergence that were previously tackled via a bag of ad-hoc tricks. We analyze our loss mathematically and prove that it admits local convergence guarantees, unlike most existing relativistic losses. Second, our new loss allows us to discard all ad-hoc tricks and replace outdated backbones used in common GANs with modern architectures. Using StyleGAN2 as an example, we present a roadmap of simplification and modernization that results in a new minimalist baseline -- R3GAN. Despite being simple, our approach surpasses StyleGAN2 on FFHQ, ImageNet, CIFAR, and Stacked MNIST datasets, and compares favorably against state-of-the-art GANs and diffusion models.
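The relativistic loss described above trains the discriminator on the *difference* between critic scores of paired real and fake samples rather than on each score in isolation. A minimal NumPy sketch of that pairing term is below; function names are illustrative, and the zero-centered R1/R2 gradient penalties that complete the regularized loss require an autodiff framework, so they are only noted in comments.

```python
import numpy as np

def softplus(x):
    # Numerically stable softplus: log(1 + exp(x)).
    return np.logaddexp(0.0, x)

def rpgan_d_loss(real_scores, fake_scores):
    # Relativistic pairing for the discriminator: penalize
    # f(D(x) - D(G(z))) with f(t) = softplus(-t), so the critic is
    # pushed to score each real sample above its paired fake.
    return softplus(-(real_scores - fake_scores)).mean()

def rpgan_g_loss(real_scores, fake_scores):
    # The generator loss is symmetric: push fake scores above real ones.
    return softplus(-(fake_scores - real_scores)).mean()

# In the full regularized objective, zero-centered gradient penalties
# on real data (R1) and fake data (R2) would be added to the
# discriminator loss; computing ||grad_x D(x)||^2 needs autodiff and
# is omitted from this sketch.
```

With equal scores the pairing term sits at log 2, the same fixed point as the standard non-saturating loss, which is one way to sanity-check an implementation.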
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Unconditional Image Generation | CIFAR-10 unconditional | FID | 1.96 | 159 |
| Image Generation | ImageNet 64x64 | FID | 2.09 | 114 |
| Image Generation | ImageNet 64x64 (train val) | FID | 2.09 | 83 |
| Image Generation | CIFAR-10 (train/test) | FID | 1.96 | 78 |
| Image Generation | FFHQ 64x64 (test) | FID | 1.95 | 69 |
| Image Generation | FFHQ 256x256 (test) | FID | 2.75 | 30 |
| Classification | MSTAR 2-shot | Precision | 51.66 | 25 |
| Classification | MSTAR 4-shot | Precision | 56.5 | 25 |
| Classification | MSTAR 8-shot | Precision | 62.47 | 25 |
| Unconditional Image Generation | StackedMNIST 1000-mode (test) | # Modes | 1.00e+3 | 11 |