Adversarial Generation of Continuous Images
About
In most existing learning systems, images are typically viewed as 2D pixel arrays. However, in another paradigm gaining popularity, a 2D image is represented as an implicit neural representation (INR): an MLP that predicts an RGB pixel value given its (x, y) coordinate. In this paper, we propose two novel architectural techniques for building INR-based image decoders, factorized multiplicative modulation and multi-scale INRs, and use them to build a state-of-the-art continuous image GAN. Previous attempts to adapt INRs for image generation were limited to MNIST-like datasets and did not scale to complex real-world data. Our proposed INR-GAN architecture improves the performance of continuous image generators severalfold, greatly reducing the gap between continuous image GANs and pixel-based ones. Apart from that, we explore several exciting properties of INR-based decoders, such as out-of-the-box superresolution, meaningful image-space interpolation, accelerated inference of low-resolution images, an ability to extrapolate outside of image boundaries, and a strong geometric prior. The project page is located at https://universome.github.io/inr-gan.
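To make the INR idea concrete, here is a minimal sketch of an MLP that maps an (x, y) coordinate to an RGB value and can be sampled on any grid. It is not the INR-GAN architecture: the layer sizes, sine activations, random weights, and the helper names `make_grid`, `init_inr`, and `render` are all assumptions chosen for illustration, and neither factorized multiplicative modulation nor the multi-scale design is included.

```python
import numpy as np

def make_grid(height, width):
    """Return a (height*width, 2) array of (x, y) coordinates in [-1, 1]."""
    ys = np.linspace(-1.0, 1.0, height)
    xs = np.linspace(-1.0, 1.0, width)
    grid_x, grid_y = np.meshgrid(xs, ys)  # each has shape (height, width)
    return np.stack([grid_x.ravel(), grid_y.ravel()], axis=1)

def init_inr(rng, hidden=64):
    """Random weights for a toy 3-layer MLP (hypothetical sizes, untrained)."""
    return {
        "w1": rng.normal(0.0, 1.0, (2, hidden)),
        "b1": np.zeros(hidden),
        "w2": rng.normal(0.0, 1.0 / np.sqrt(hidden), (hidden, hidden)),
        "b2": np.zeros(hidden),
        "w3": rng.normal(0.0, 1.0 / np.sqrt(hidden), (hidden, 3)),
        "b3": np.zeros(3),
    }

def render(params, height, width):
    """Evaluate the MLP at every grid coordinate and return an RGB image."""
    coords = make_grid(height, width)
    h = np.sin(coords @ params["w1"] + params["b1"])
    h = np.sin(h @ params["w2"] + params["b2"])
    logits = h @ params["w3"] + params["b3"]
    rgb = 1.0 / (1.0 + np.exp(-logits))  # sigmoid keeps values in (0, 1)
    return rgb.reshape(height, width, 3)

rng = np.random.default_rng(0)
params = init_inr(rng)

# The same weights define the image at any sampling density.
low = render(params, 32, 32)
high = render(params, 64, 64)
print(low.shape, high.shape)
```

Because the image is a function of continuous coordinates, rendering at a higher resolution only requires a denser grid, which is the sense in which superresolution comes "out of the box"; passing coordinates outside [-1, 1] would likewise query the function beyond the image boundaries.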
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Image Generation | CIFAR10 32x32 (test) | FID | 8.62 | 154 |
| Unconditional Image Generation | FFHQ 256x256 | FID | 9.57 | 64 |
| Image Generation | CelebA-HQ (test) | FID | 10.3 | 42 |
| Image Generation | FFHQ 256x256 (test) | FID | 4.95 | 30 |
| Image Generation | FFHQ (test) | FID | 9.57 | 21 |
| Unconditional Image Generation | LSUN Church 256x256 | FID | 5.09 | 14 |
| Unconditional image synthesis | FFHQ 1024 | FID | 16.32 | 12 |
| Image Generation | AFHQ cat v2 | FID | 7.35 | 9 |
| Image Generation | AFHQ Dog v2 (test) | FID | 23.93 | 9 |
| Image Generation | CelebA-HQ | Precision | 68.2 | 3 |