Gated Differential Linear Attention: A Linear-Time Decoder for High-Fidelity Medical Segmentation

About

Medical image segmentation requires models that preserve fine anatomical boundaries while remaining efficient for clinical deployment. While Transformers capture long-range dependencies, they suffer from quadratic attention cost and large data requirements, whereas CNNs are compute-friendly yet struggle with global reasoning. Linear attention offers $\mathcal{O}(N)$ scaling, but often exhibits training instability and attention dilution, yielding diffuse attention maps. We introduce PVT-GDLA, a decoder-centric Transformer that restores sharp, long-range dependencies in linear time. Its core, Gated Differential Linear Attention (GDLA), computes two kernelized attention paths on complementary query/key subspaces and subtracts them with a learnable, channel-wise scale to cancel common-mode noise and amplify relevant context. A lightweight, head-specific gate injects nonlinearity and input-adaptive sparsity, mitigating attention sink, and a parallel local token-mixing branch with depthwise convolution strengthens neighboring-token interactions, improving boundary fidelity, all while retaining $\mathcal{O}(N)$ complexity and low parameter overhead. Coupled with a pretrained Pyramid Vision Transformer (PVT) encoder, PVT-GDLA achieves state-of-the-art accuracy across CT, MRI, ultrasound, and dermoscopy benchmarks under equal training budgets, with comparable parameters but lower FLOPs than CNN-, Transformer-, hybrid-, and linear-attention baselines. PVT-GDLA provides a practical path to fast, scalable, high-fidelity medical segmentation in clinical environments and other resource-constrained settings.
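
The GDLA mechanism described in the abstract can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation. Several details are assumptions made only for illustration: the feature map `elu(x) + 1`, a single head, splitting the query/key channels into two halves to stand in for the "complementary subspaces", an elementwise sigmoid gate in place of the head-specific gate, and the hypothetical names `gdla`, `wq`, `wk`, `wv`, `wg`, and `lam`. The parallel depthwise-convolution branch is omitted.

```python
import math
import random

def phi(x):
    """Positive feature map elu(x) + 1 (an assumed kernel, not taken from the paper)."""
    return x + 1.0 if x > 0 else math.exp(x)

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def matmul(a, w):
    """Multiply an (n x k) list-of-lists by a (k x m) list-of-lists."""
    return [[sum(r * c for r, c in zip(row, col)) for col in zip(*w)] for row in a]

def linear_attention(q, k, v):
    """Kernelized attention: phi(q_i) (sum_j phi(k_j) v_j^T) / (phi(q_i) . sum_j phi(k_j)).

    The key/value summaries are accumulated once, so the cost is linear in the
    number of tokens N rather than quadratic.
    """
    fq = [[phi(x) for x in row] for row in q]
    fk = [[phi(x) for x in row] for row in k]
    d, dv = len(fk[0]), len(v[0])
    kv = [[0.0] * dv for _ in range(d)]  # sum over tokens of phi(k_j) v_j^T
    ksum = [0.0] * d                     # sum over tokens of phi(k_j)
    for fk_j, v_j in zip(fk, v):
        for a in range(d):
            ksum[a] += fk_j[a]
            for c in range(dv):
                kv[a][c] += fk_j[a] * v_j[c]
    out = []
    for fq_i in fq:
        norm = sum(x * s for x, s in zip(fq_i, ksum))  # strictly positive since phi > 0
        out.append([sum(fq_i[a] * kv[a][c] for a in range(d)) / norm for c in range(dv)])
    return out

def gdla(x, wq, wk, wv, wg, lam):
    """Gated differential linear attention sketch: gate * (path1 - lam * path2)."""
    q, k, v, g = matmul(x, wq), matmul(x, wk), matmul(x, wv), matmul(x, wg)
    h = len(q[0]) // 2  # assumed split of query/key channels into two subspaces
    path1 = linear_attention([r[:h] for r in q], [r[:h] for r in k], v)
    path2 = linear_attention([r[h:] for r in q], [r[h:] for r in k], v)
    # Subtract the second path with a channel-wise scale, then apply the gate.
    return [
        [sigmoid(g_c) * (p1 - l * p2) for g_c, p1, p2, l in zip(g_i, r1, r2, lam)]
        for g_i, r1, r2 in zip(g, path1, path2)
    ]

# Usage on random data: 6 tokens with 4 channels each.
random.seed(0)
n_tokens, dim = 6, 4

def rand_matrix(rows, cols):
    return [[random.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]

x = rand_matrix(n_tokens, dim)
wq, wk, wv, wg = (rand_matrix(dim, dim) for _ in range(4))
lam = [0.5] * dim  # in a trained model this would be learned; fixed here
y = gdla(x, wq, wk, wv, wg, lam)
print(len(y), len(y[0]))  # 6 4
```

Because each path normalizes positive kernel weights, its output is a weighted average of the value vectors. The subtraction then removes whatever the two averages have in common, which is one way to read the abstract's "cancel common-mode noise".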

Hongbo Zheng, Afshin Bozorgpour, Dorit Merhof, Minjia Zhang • 2026

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Medical Image Segmentation | BUSI | Dice Score: 80.54 | 91 |
| Skin Lesion Segmentation | PH2 | DIC: 0.9559 | 70 |
| Cardiac Segmentation | ACDC | RV Score: 91.3 | 68 |
| Medical Image Segmentation | Synapse | Average DSC: 85.32 | 52 |
| Skin Lesion Segmentation | HAM10000 | Dice Coefficient: 95.01 | 12 |
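
Several of the results above are Dice scores, the standard overlap measure $2|A \cap B| / (|A| + |B|)$ between a predicted mask $A$ and a reference mask $B$. The sketch below shows that computation for flat binary masks. The function name `dice_score` and the convention of returning 1.0 when both masks are empty are assumptions for illustration, not details from the listing. The value lies in $[0, 1]$; some entries above appear to report it multiplied by 100.

```python
def dice_score(pred, target):
    """Dice = 2|A ∩ B| / (|A| + |B|) for two binary masks given as flat 0/1 sequences."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same length")
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total

pred = [1, 1, 0, 0, 1, 0]
target = [1, 0, 0, 0, 1, 1]
score = dice_score(pred, target)  # 2 * 2 / (3 + 3) = 2/3
print(round(100 * score, 2))      # 66.67
```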
