From Local Windows to Adaptive Candidates via Individualized Exploratory: Rethinking Attention for Image Super-Resolution
About
Single Image Super-Resolution (SISR) is a fundamental computer vision task that aims to reconstruct a high-resolution (HR) image from a low-resolution (LR) input. Transformer-based methods have achieved remarkable performance by modeling long-range dependencies in degraded images. However, their feature-intensive attention computation incurs high computational cost. To improve efficiency, most existing approaches partition images into fixed groups and restrict attention within each group. Such group-wise attention overlooks the inherent asymmetry in token similarities, thereby failing to enable flexible and token-adaptive attention computation. To address this limitation, we propose the Individualized Exploratory Transformer (IET), which introduces a novel Individualized Exploratory Attention (IEA) mechanism that allows each token to adaptively select its own content-aware and independent attention candidates. This token-adaptive and asymmetric design enables more precise information aggregation while maintaining computational efficiency. Extensive experiments on standard SR benchmarks demonstrate that IET achieves state-of-the-art performance under comparable computational complexity.
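To make the contrast with fixed-group attention concrete, here is a minimal sketch of per-token candidate selection. It is not the authors' IEA mechanism, whose details this abstract does not give. It is plain top-k sparse attention, written under simplifying assumptions: queries, keys, and values are all the raw token features with no learned projections, and the function name `topk_candidate_attention` and the parameter `k` are hypothetical. It illustrates the asymmetry the abstract refers to: token `j` may be a candidate of token `i` without `i` being a candidate of `j`.

```python
import numpy as np

def topk_candidate_attention(x, k):
    """Each token attends only to its own k most similar tokens.

    x: (n, d) array of token features (assumed to serve as Q, K and V).
    Returns the aggregated features (n, d) and the candidate indices (n, k).
    """
    n, d = x.shape
    scores = x @ x.T / np.sqrt(d)                        # (n, n) scaled similarities
    # Per-row indices of the k highest scores (order within the k is arbitrary).
    idx = np.argpartition(scores, -k, axis=1)[:, -k:]    # (n, k)
    cand = np.take_along_axis(scores, idx, axis=1)       # (n, k) candidate scores
    cand = cand - cand.max(axis=1, keepdims=True)        # numerically stable softmax
    weights = np.exp(cand)
    weights /= weights.sum(axis=1, keepdims=True)
    # Weighted sum over each token's own candidate set.
    out = np.einsum("nk,nkd->nd", weights, x[idx])
    return out, idx

rng = np.random.default_rng(0)
tokens = rng.standard_normal((16, 8))
out, idx = topk_candidate_attention(tokens, k=4)
print(out.shape, idx.shape)
```

This sketch still builds the full `n × n` score matrix before selecting candidates, so it shows only the selection pattern, not the efficiency gain the paper claims. With `k = n` it reduces to ordinary dense softmax attention.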
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Super-Resolution | Set5 | PSNR 38.44 | 751 |
| Super-Resolution | Set14 | PSNR 34.28 | 586 |
| Super-Resolution | BSD100 | PSNR 32.5 | 313 |
| Super-Resolution | Manga109 | PSNR 39.75 | 298 |
| Classic Image Super-Resolution | BSDS100 (test) | PSNR 32.71 | 78 |
| Image Super-resolution | Set14 classic (test) | PSNR 35.09 | 52 |
| Classical Image Super-Resolution | Set5 classical SR (test) | PSNR 38.74 | 39 |
| Classical Image Super-Resolution | Urban100 classical SR (test) | PSNR 35.07 | 39 |
| Classical Image Super-Resolution | Manga109 classical SR (test) | PSNR 40.61 | 36 |
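The results above are reported as PSNR (peak signal-to-noise ratio, in dB), defined as 10·log10(MAX² / MSE) between the reference HR image and the reconstruction. The sketch below computes it for 8-bit images. The function name `psnr` and the `max_value` parameter are illustrative. This page does not state the evaluation protocol; SR benchmarks often measure PSNR on the luminance channel after cropping a border, and the sketch leaves such details out.

```python
import numpy as np

def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two same-shaped images."""
    reference = np.asarray(reference, dtype=np.float64)
    estimate = np.asarray(estimate, dtype=np.float64)
    mse = np.mean((reference - estimate) ** 2)   # mean squared error
    if mse == 0:
        return float("inf")                      # identical images
    return float(10.0 * np.log10(max_value ** 2 / mse))

hr = np.zeros((4, 4))
sr = np.full((4, 4), 25.5)   # constant error of one tenth of the peak value
print(psnr(hr, sr))
```

Because the scale is logarithmic, a gain of 0.1 dB between two methods corresponds to roughly a 2% reduction in mean squared error.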