
Dual Aggregation Transformer for Image Super-Resolution

About

Transformers have recently gained considerable popularity in low-level vision tasks, including image super-resolution (SR). These networks apply self-attention along different dimensions, spatial or channel, and achieve impressive performance. This inspires us to combine the two dimensions in a single Transformer for more powerful representation capability. Based on this idea, we propose a novel Transformer model, the Dual Aggregation Transformer (DAT), for image SR. DAT aggregates features across the spatial and channel dimensions in an inter-block and intra-block dual manner. Specifically, we alternately apply spatial and channel self-attention in consecutive Transformer blocks. This alternating strategy enables DAT to capture the global context and realize inter-block feature aggregation. Furthermore, we propose the adaptive interaction module (AIM) and the spatial-gate feed-forward network (SGFN) to achieve intra-block feature aggregation. AIM complements the two self-attention mechanisms from their corresponding dimensions, while SGFN introduces additional non-linear spatial information into the feed-forward network. Extensive experiments show that DAT surpasses current methods. Code and models are available at https://github.com/zhengchen1999/DAT.
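The core idea, alternating self-attention over spatial positions with self-attention over channels in consecutive blocks, can be sketched in a few lines of NumPy. This is not the paper's implementation: query/key/value projections, multi-head splitting, AIM, and SGFN are all omitted, and the shapes and depth are illustrative assumptions. It shows only the alternating inter-block pattern: spatial attention builds an (N, N) map over positions, channel attention an (C, C) map over feature channels.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def spatial_attention(x):
    # x: (N, C) with N spatial tokens; attention map is (N, N) over positions.
    attn = softmax(x @ x.T / np.sqrt(x.shape[1]))
    return attn @ x                              # (N, C)

def channel_attention(x):
    # Transpose so attention runs over channels: map is (C, C).
    xt = x.T                                     # (C, N)
    attn = softmax(xt @ xt.T / np.sqrt(x.shape[0]))
    return (attn @ xt).T                         # back to (N, C)

def alternating_blocks(x, depth=4):
    # Alternate the two attention types in consecutive blocks
    # (the inter-block aggregation strategy), with residual connections.
    for i in range(depth):
        attn = spatial_attention if i % 2 == 0 else channel_attention
        x = x + attn(x)
    return x

x = np.random.randn(16, 8)   # e.g. a 4x4 patch flattened to 16 tokens, 8 channels
y = alternating_blocks(x)
assert y.shape == x.shape
```

Both branches preserve the (N, C) token layout, which is what lets the two attention types be stacked freely block after block.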

Zheng Chen, Yulun Zhang, Jinjin Gu, Linghe Kong, Xiaokang Yang, Fisher Yu • 2023

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Image Super-resolution | Manga109 | PSNR | 40.33 | 656 |
| Image Super-resolution | Set5 (test) | PSNR | 38.58 | 544 |
| Single Image Super-Resolution | Urban100 | PSNR | 34.37 | 500 |
| Image Super-resolution | Set14 | PSNR | 34.81 | 329 |
| Image Super-resolution | Urban100 | PSNR | 34.37 | 221 |
| Super-Resolution | Set5 x2 | PSNR | 38.58 | 134 |
| Super-Resolution | Set14 4x (test) | PSNR | 29.23 | 117 |
| Super-Resolution | Set5 x2 (test) | PSNR | 38.63 | 95 |
| Image Super-resolution | Urban100 x4 (test) | PSNR | 27.87 | 90 |
| Image Super-resolution | BSDS100 | PSNR | 32.61 | 85 |

Showing 10 of 40 rows.

Other info

Code: https://github.com/zhengchen1999/DAT
