Enriched CNN-Transformer Feature Aggregation Networks for Super-Resolution
About
Recent transformer-based super-resolution (SR) methods have achieved promising results compared with conventional CNN-based methods. However, these approaches are inherently shortsighted because they rely solely on standard self-attention for reasoning. In this paper, we introduce an effective hybrid SR network that aggregates enriched features, combining local features from CNNs with long-range, multi-scale dependencies captured by transformers. Specifically, our network comprises a transformer branch and a convolutional branch, which synergistically complement each other's representations during the restoration procedure. Furthermore, we propose a cross-scale token attention module that lets the transformer branch efficiently exploit informative relationships among tokens across different scales. Our proposed method achieves state-of-the-art SR results on numerous benchmark datasets.
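To illustrate the idea behind cross-scale token attention, here is a minimal NumPy sketch. It is a hypothetical simplification, not the paper's implementation: fine-scale tokens act as queries, while keys and values come from coarse-scale tokens formed by average-pooling groups of fine tokens, so each token can aggregate information from a different scale. The function name, pooling scheme, and single-head formulation are all assumptions for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_scale_token_attention(fine_tokens, pool=2):
    """Illustrative sketch (not the paper's module): fine-scale tokens
    attend to coarse-scale tokens obtained by average-pooling groups of
    `pool` consecutive tokens.

    fine_tokens: (N, d) array of N tokens with d channels.
    Returns an (N, d) array of cross-scale aggregated features.
    """
    n, d = fine_tokens.shape
    # Coarse tokens: average-pool non-overlapping groups of `pool` tokens.
    coarse = fine_tokens[: n - n % pool].reshape(-1, pool, d).mean(axis=1)
    q = fine_tokens          # queries from the fine scale
    k = v = coarse           # keys/values from the coarse scale
    # Scaled dot-product attention across scales: (N, N // pool) weights.
    attn = softmax(q @ k.T / np.sqrt(d))
    return attn @ v

tokens = np.random.rand(8, 16)
out = cross_scale_token_attention(tokens)  # shape (8, 16)
```

In the full model, such cross-scale attention would sit inside the transformer branch alongside standard self-attention, while the convolutional branch supplies the complementary local features.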
Related benchmarks
| Task | Dataset | PSNR (dB) | Rank |
|---|---|---|---|
| Super-Resolution | Set14 x4 (test) | 29.27 | 117 |
| Super-Resolution | Set5 x2 (test) | 38.53 | 95 |
| Image Super-Resolution | Urban100 x4 (test) | 27.92 | 90 |
| Super-Resolution | Manga109 x4 | 32.44 | 88 |
| Super-Resolution | Set5 x3 (test) | 35.09 | 87 |
| Image Super-Resolution | Urban100 x2 (test) | 34.25 | 72 |
| Image Super-Resolution | Urban100 x3 (test) | 30.26 | 58 |
| Super-Resolution | BSD100 x4 (test) | 28 | 56 |
| Image Super-Resolution | Manga109 x2 (test) | 40.11 | 52 |
| Super-Resolution | Set14 x2 | 34.68 | 51 |