
DiNAT-IR: Exploring Dilated Neighborhood Attention for High-Quality Image Restoration

About

Transformers, with their self-attention mechanisms for modeling long-range dependencies, have become a dominant paradigm in image restoration tasks. However, the high computational cost of self-attention limits scalability to high-resolution images, making efficiency-quality trade-offs a key research focus. To address this, Restormer employs channel-wise self-attention, which computes attention across channels instead of spatial dimensions. While effective, this approach may overlook localized artifacts that are crucial for high-quality image restoration. To bridge this gap, we explore Dilated Neighborhood Attention (DiNA) as a promising alternative, inspired by its success in high-level vision tasks. DiNA balances global context and local precision by integrating sliding-window attention with mixed dilation factors, effectively expanding the receptive field without excessive overhead. However, our preliminary experiments indicate that directly applying this global-local design to the classic deblurring task hinders accurate visual restoration, primarily due to the constrained global context understanding within local attention. To address this, we introduce a channel-aware module that complements local attention, effectively integrating global context without sacrificing pixel-level precision. The proposed DiNAT-IR, a Transformer-based architecture specifically designed for image restoration, achieves competitive results across multiple benchmarks, offering a high-quality solution for diverse low-level computer vision problems.
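The core idea of Dilated Neighborhood Attention, each query attending only to a small window of neighbors spaced a dilation factor apart, can be illustrated with a minimal sketch. The following is a hypothetical single-head, 1-D simplification for intuition only; the actual DiNA operates on 2-D windows with learned query/key/value projections, multiple heads, and mixed dilation factors across blocks. The function name and all parameters are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def dilated_neighborhood_attention_1d(x, kernel_size=3, dilation=2):
    """Toy 1-D dilated neighborhood attention (single head, no projections).

    Each position attends to at most `kernel_size` neighbors spaced
    `dilation` steps apart, so the receptive field spans
    dilation * (kernel_size - 1) + 1 positions while the per-query cost
    stays O(kernel_size). This is a didactic sketch, not DiNAT-IR itself.
    """
    n, c = x.shape
    half = kernel_size // 2
    out = np.zeros_like(x)
    for i in range(n):
        # Gather the valid dilated neighbor indices around position i
        # (out-of-range neighbors near the borders are simply dropped here).
        idx = [i + d * dilation for d in range(-half, half + 1)]
        idx = [j for j in idx if 0 <= j < n]
        neighbors = x[idx]                       # (k, c) neighbor values
        scores = neighbors @ x[i] / np.sqrt(c)   # (k,) scaled dot products
        weights = np.exp(scores - scores.max())  # numerically stable softmax
        weights /= weights.sum()
        out[i] = weights @ neighbors             # convex combination of neighbors
    return out

x = np.random.default_rng(0).normal(size=(16, 8))
y = dilated_neighborhood_attention_1d(x, kernel_size=3, dilation=2)
print(y.shape)  # → (16, 8)
```

With `kernel_size=3` and `dilation=2`, each output position mixes information from up to 5 positions away for the cost of 3 comparisons, which is the efficiency-quality trade-off the abstract describes.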

Hanzhou Liu, Binghan Li, Chengkai Liu, Mi Lu • 2025

Related benchmarks

Task               Dataset        Result        Rank
Deraining          Rain100L       PSNR 38.93    196
Image Deraining    2800 (test)    PSNR 33.91    42
Image Deraining    1200 (test)    PSNR 32.31    35
Deraining          Rain100H       PSNR 31.26    8
Deraining          Test100        PSNR 31.22    5
