SwinIR: Image Restoration Using Swin Transformer
About
Image restoration is a long-standing low-level vision problem that aims to restore high-quality images from low-quality images (e.g., downscaled, noisy and compressed images). While state-of-the-art image restoration methods are based on convolutional neural networks, few attempts have been made with Transformers, which show impressive performance on high-level vision tasks. In this paper, we propose a strong baseline model SwinIR for image restoration based on the Swin Transformer. SwinIR consists of three parts: shallow feature extraction, deep feature extraction and high-quality image reconstruction. In particular, the deep feature extraction module is composed of several residual Swin Transformer blocks (RSTB), each of which has several Swin Transformer layers together with a residual connection. We conduct experiments on three representative tasks: image super-resolution (including classical, lightweight and real-world image super-resolution), image denoising (including grayscale and color image denoising) and JPEG compression artifact reduction. Experimental results demonstrate that SwinIR outperforms state-of-the-art methods on different tasks by **up to 0.14~0.45 dB**, while the total number of parameters can be reduced by **up to 67%**.
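The three-part structure described above can be illustrated with a minimal wiring sketch. This is not the authors' implementation: the names `make_rstb` and `swinir_forward` are hypothetical, features are plain lists of floats rather than image tensors, and every layer is a placeholder callable standing in for a real convolution or Swin Transformer layer. The sketch also assumes that the shallow features are added back to the deep features before reconstruction, which the abstract does not state.

```python
from typing import Callable, List, Sequence

Features = List[float]
Layer = Callable[[Features], Features]


def add(a: Features, b: Features) -> Features:
    """Element-wise sum used for every residual (skip) connection."""
    return [x + y for x, y in zip(a, b)]


def make_rstb(layers: Sequence[Layer]) -> Layer:
    """Build one residual block: several layers wrapped by a skip connection.

    `layers` are placeholders for Swin Transformer layers (hypothetical).
    """
    def rstb(x: Features) -> Features:
        out = x
        for layer in layers:
            out = layer(out)
        # Residual connection around the whole stack of layers.
        return add(x, out)

    return rstb


def swinir_forward(
    image: Features,
    shallow: Layer,
    blocks: Sequence[Layer],
    reconstruct: Layer,
) -> Features:
    """Run the three stages: shallow features, deep features, reconstruction."""
    shallow_features = shallow(image)

    # Deep feature extraction: a chain of residual blocks.
    deep_features = shallow_features
    for block in blocks:
        deep_features = block(deep_features)

    # Assumption: a long skip connection fuses shallow and deep features
    # before the reconstruction stage.
    return reconstruct(add(shallow_features, deep_features))
```

Passing identity functions for `shallow` and `reconstruct`, and a simple scaling function for each layer, is enough to trace how the nested residual connections accumulate.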
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Object Detection | COCO 2017 (val) | -- | 2454 |
| Instance Segmentation | COCO 2017 (val) | APm 0.102 | 1144 |
| Semantic segmentation | ADE20K | mIoU 14.3 | 936 |
| Super-Resolution | Set5 | PSNR 38.42 | 751 |
| Image Super-resolution | Manga109 | PSNR 39.92 | 656 |
| Super-Resolution | Urban100 | PSNR 33.81 | 603 |
| Super-Resolution | Set14 | PSNR 34.46 | 586 |
| Image Deblurring | GoPro (test) | PSNR 29.88 | 585 |
| Image Super-resolution | Set5 (test) | PSNR 38.42 | 544 |
| Image Super-resolution | Set5 | PSNR 38.42 | 507 |
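
Most results in this table are PSNR values in dB. As a reading aid, the standard PSNR definition can be sketched as below. The function name `psnr` and the flat-sequence input are illustrative assumptions. The evaluation protocol behind these benchmark numbers (color space, border cropping, value range) is not given here, so this sketch is not expected to reproduce the listed values.

```python
import math
from typing import Sequence


def psnr(
    reference: Sequence[float],
    restored: Sequence[float],
    max_value: float = 255.0,
) -> float:
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences.

    `max_value` is the largest possible pixel value (255 assumed for 8-bit images).
    """
    if len(reference) != len(restored) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    # Mean squared error over all pixels.
    mse = sum((r - s) ** 2 for r, s in zip(reference, restored)) / len(reference)

    if mse == 0:
        # Identical images: PSNR is unbounded.
        return math.inf

    return 10.0 * math.log10(max_value ** 2 / mse)
```

Because PSNR is logarithmic in the mean squared error, a gain of 0.45 dB corresponds to roughly 10% lower MSE at the same peak value.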