
N-Gram in Swin Transformers for Efficient Lightweight Image Super-Resolution

About

While some studies have shown that the Swin Transformer (Swin) with window self-attention (WSA) is suitable for single image super-resolution (SR), plain WSA ignores broad regions when reconstructing high-resolution images due to its limited receptive field. In addition, many deep learning SR methods suffer from intensive computation. To address these problems, we introduce the N-Gram context to low-level vision with Transformers for the first time. We define an N-Gram as a group of neighboring local windows in Swin, which differs from text analysis, where an N-Gram is a sequence of consecutive characters or words. N-Grams interact with each other through sliding-WSA, expanding the regions seen when restoring degraded pixels. Using the N-Gram context, we propose NGswin, an efficient SR network with an SCDP bottleneck that takes the multi-scale outputs of the hierarchical encoder. Experimental results show that NGswin achieves competitive performance while maintaining an efficient structure compared with previous leading methods. Moreover, we also improve other Swin-based SR methods with the N-Gram context, building an enhanced model: SwinIR-NG. Our improved SwinIR-NG outperforms the current best lightweight SR approaches and establishes state-of-the-art results. Code is available at https://github.com/rami0205/NGramSwin.
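To make the N-Gram idea concrete, here is a toy NumPy sketch. It is not the paper's implementation: it replaces sliding-WSA attention with simple average pooling, and all function names (`window_partition`, `ngram_context`) and parameters are illustrative. It only shows the structural idea that each Swin window's context is widened to an n x n neighborhood of windows.

```python
import numpy as np

def window_partition(x, ws):
    """Split an (H, W, C) feature map into non-overlapping ws x ws windows."""
    H, W, C = x.shape
    x = x.reshape(H // ws, ws, W // ws, ws, C)
    return x.transpose(0, 2, 1, 3, 4).reshape(-1, ws, ws, C)

def ngram_context(x, ws, n=2):
    """Toy N-Gram context: summarize each window with average pooling,
    then slide an n x n neighborhood over the grid of window descriptors
    so every window also sees its neighbors (a crude stand-in for the
    paper's sliding-WSA interaction between neighboring windows)."""
    H, W, C = x.shape
    gh, gw = H // ws, W // ws
    # Per-window descriptor (the paper uses attention, not pooling).
    desc = window_partition(x, ws).mean(axis=(1, 2)).reshape(gh, gw, C)
    # Pad so border windows also get full n x n neighborhoods.
    pad = np.pad(desc, ((0, n - 1), (0, n - 1), (0, 0)), mode="edge")
    ctx = np.zeros_like(desc)
    for i in range(gh):
        for j in range(gw):
            # Each window's context now spans n x n neighboring windows.
            ctx[i, j] = pad[i:i + n, j:j + n].mean(axis=(0, 1))
    return ctx  # shape (gh, gw, C)

x = np.random.rand(8, 8, 4)        # tiny 8x8 feature map with 4 channels
ctx = ngram_context(x, ws=4, n=2)  # 2x2 grid of windows, bi-gram context
print(ctx.shape)                   # (2, 2, 4)
```

With n=1 this degenerates to plain per-window context (as in standard WSA); n=2 corresponds to a bi-gram neighborhood, which is how the sliding interaction enlarges the effective receptive field.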

Haram Choi, Jeongmin Lee, Jihoon Yang · 2022

Related benchmarks

Task               Dataset           Result        Rank
Super-Resolution   Set5              PSNR 38.17    751
Super-Resolution   Urban100          PSNR 32.78    603
Super-Resolution   Set14             PSNR 33.94    586
Super-Resolution   BSD100            PSNR 32.31    313
Super-Resolution   Manga109          PSNR 39.2     298
Super-Resolution   Set14 (test)      PSNR 28.83    246
Super-Resolution   BSD100 (test)     PSNR 27.71    216
Super-Resolution   Urban100 (test)   PSNR 26.54    205
Super-Resolution   Set5 (test)       PSNR 32.44    184
Super-Resolution   Set14 4x (test)   PSNR 28.83    117

(Showing 10 of 30 rows)
