
XAttnRes: Cross-Stage Attention Residuals for Medical Image Segmentation

About

In the field of Large Language Models (LLMs), Attention Residuals have recently demonstrated that learned, selective aggregation over all preceding layer outputs can outperform fixed residual connections. We propose Cross-Stage Attention Residuals (XAttnRes), a mechanism that maintains a global feature history pool accumulating both encoder and decoder stage outputs. Through lightweight pseudo-query attention, each stage selectively aggregates from all preceding representations. To bridge the gap between the same-dimensional Transformer layers in LLMs and the multi-scale encoder-decoder stages in segmentation networks, XAttnRes introduces spatial alignment and channel projection steps that handle cross-resolution features with negligible overhead. When added to existing segmentation networks, XAttnRes consistently improves performance across four datasets and three imaging modalities. We further observe that XAttnRes alone, even without skip connections, achieves performance on par with the baseline, suggesting that learned aggregation can recover the inter-stage information flow traditionally provided by predetermined connections.
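To make the mechanism concrete, here is a minimal NumPy sketch of the aggregation step described above: each history feature is spatially aligned to the current stage's resolution, channel-projected to the current width, and then combined via a lightweight pseudo-query attention. All names, shapes, and the use of pooled features as pseudo-queries are illustrative assumptions, not the authors' actual implementation (which would use learned projections and run inside a segmentation network).

```python
import numpy as np

def xattn_res(history, query_feat, rng):
    """Illustrative sketch of Cross-Stage Attention Residuals (XAttnRes).

    history:    list of (C_i, H_i, W_i) feature maps from preceding stages.
    query_feat: (C, H, W) output of the current stage.
    rng:        numpy Generator; random matrices stand in for learned weights.
    """
    C, H, W = query_feat.shape
    aligned = []
    for f in history:
        Ci, Hi, Wi = f.shape
        # Spatial alignment: nearest-neighbour resize to the current (H, W).
        rows = np.arange(H) * Hi // H
        cols = np.arange(W) * Wi // W
        f = f[:, rows][:, :, cols]
        # Channel projection: a 1x1-conv-like map to C channels
        # (random matrix here as a stand-in for a learned projection).
        P = rng.standard_normal((C, Ci)) / np.sqrt(Ci)
        aligned.append(np.einsum('ck,khw->chw', P, f))
    # Pseudo-query attention: pool the current stage into one query vector
    # and score each aligned history entry with a scalar weight.
    q = query_feat.mean(axis=(1, 2))                         # (C,)
    keys = np.stack([a.mean(axis=(1, 2)) for a in aligned])  # (N, C)
    scores = keys @ q / np.sqrt(C)
    w = np.exp(scores - scores.max())
    w /= w.sum()
    # Attention-weighted residual over the global feature history pool.
    return query_feat + np.einsum('n,nchw->chw', w, np.stack(aligned))
```

Because the alignment and projection are per-entry 1x1 operations plus a resize, the extra cost scales with the history length rather than with attention over spatial positions, which is consistent with the "negligible overhead" claim.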

Xinyu Liu, Qing Xu, Zhen Chen • 2026

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Multi-organ Segmentation | Synapse multi-organ CT (test) | DSC | 83.75 | 95 |
| Polyp Segmentation | ClinicDB | mDice | 0.9532 | 64 |
| 2D Segmentation | ISIC 2017 | Dice Coefficient | 0.8613 | 28 |
| 2D binary segmentation | ColonDB | Dice Score | 92.4 | 14 |
