Image Super-Resolution via Iterative Refinement
About
We present SR3, an approach to image Super-Resolution via Repeated Refinement. SR3 adapts denoising diffusion probabilistic models to conditional image generation and performs super-resolution through a stochastic denoising process. Inference starts with pure Gaussian noise and iteratively refines the noisy output using a U-Net model trained on denoising at various noise levels. SR3 exhibits strong performance on super-resolution tasks at different magnification factors, on faces and natural images. We conduct human evaluation on a standard 8X face super-resolution task on CelebA-HQ, comparing with SOTA GAN methods. SR3 achieves a fool rate close to 50%, suggesting photo-realistic outputs, while GANs do not exceed a fool rate of 34%. We further show the effectiveness of SR3 in cascaded image generation, where generative models are chained with super-resolution models, yielding a competitive FID score of 11.3 on ImageNet.
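The inference procedure described above (start from pure Gaussian noise, then repeatedly denoise while conditioning on the low-resolution input) can be sketched as a generic DDPM-style sampling loop. This is a minimal illustration, not the authors' implementation: the linear noise schedule, the step count, and the names `make_schedule`, `dummy_denoiser`, and `iterative_refinement` are all assumptions. `dummy_denoiser` is a hypothetical stand-in for the trained U-Net; it simply pretends the clean image equals the upsampled input, which keeps the sketch runnable without a trained model.

```python
import numpy as np


def make_schedule(num_steps=50, beta_start=1e-4, beta_end=0.02):
    """Linear variance schedule (an assumed choice, not taken from the source)."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)  # cumulative product: overall signal level at step t
    return betas, alphas, alpha_bars


def dummy_denoiser(x_low_up, y_t, noise_level):
    """Hypothetical stand-in for the trained U-Net.

    Returns the noise that would explain y_t if the clean image were exactly
    the upsampled low-resolution input. A real model would predict this noise
    from (x_low_up, y_t, noise_level) with learned weights.
    """
    return (y_t - np.sqrt(noise_level) * x_low_up) / np.sqrt(1.0 - noise_level)


def iterative_refinement(x_low_up, denoiser, num_steps=50, seed=0):
    """Refine pure Gaussian noise into an image, conditioned on x_low_up."""
    rng = np.random.default_rng(seed)
    betas, alphas, alpha_bars = make_schedule(num_steps)

    # Inference starts from pure Gaussian noise with the target image shape.
    y = rng.standard_normal(x_low_up.shape)

    for t in range(num_steps - 1, -1, -1):
        # Predict the noise component at the current noise level.
        eps = denoiser(x_low_up, y, alpha_bars[t])
        # Remove the predicted noise (DDPM posterior mean).
        y = (y - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps) / np.sqrt(alphas[t])
        # Re-inject a smaller amount of noise on every step except the last,
        # which is what makes the process stochastic.
        if t > 0:
            y = y + np.sqrt(betas[t]) * rng.standard_normal(y.shape)
    return y


if __name__ == "__main__":
    x = np.linspace(-1.0, 1.0, 16).reshape(4, 4)  # toy "upsampled" input
    out = iterative_refinement(x, dummy_denoiser, num_steps=20)
    print(out.shape)
```

With the placeholder denoiser the loop converges back to the conditioning image, which is only a sanity check of the update rule; a trained network would instead add plausible high-frequency detail.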
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Image Super-resolution | Set5 | PSNR 36.69 | 507 |
| Class-conditional Image Generation | ImageNet 256x256 | -- | 441 |
| Class-conditional Image Generation | ImageNet 256x256 (val) | -- | 293 |
| Image Generation | ImageNet 256x256 | FID 11.3 | 243 |
| Image Super-resolution | Urban100 | PSNR 30.29 | 221 |
| Image Super-resolution | Manga109 | LPIPS 0.0161 | 38 |
| Image Restoration | Urban100 | PSNR 18.9 | 32 |
| Class-conditional Image Generation | ImageNet (train val) | FID 11.3 | 30 |
| Superresolution | CelebA-HQ (test) | PSNR 23.51 | 25 |
| Image Super-resolution | B100 | PSNR 30.41 | 24 |
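
Several results in the table above are reported as PSNR (peak signal-to-noise ratio, in dB; higher is better). As a rough guide to what that number measures, here is a minimal sketch of the standard definition, 10·log10(MAX² / MSE). The function name `psnr` and the 8-bit default `max_value=255.0` are assumptions; benchmark protocols often add details such as evaluating on the luminance channel only or cropping borders, and the source does not say which protocol each row uses.

```python
import numpy as np


def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two same-shaped images."""
    reference = np.asarray(reference, dtype=np.float64)
    estimate = np.asarray(estimate, dtype=np.float64)
    mse = np.mean((reference - estimate) ** 2)  # mean squared error per pixel
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    ref = np.zeros((4, 4))
    est = ref + 1.0  # every pixel off by one gray level
    print(round(psnr(ref, est), 2))
```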