MSLAU-Net: A Hybrid CNN-Transformer Network for Medical Image Segmentation

About

Accurate medical image segmentation allows for the precise delineation of anatomical structures and pathological regions, which is essential for treatment planning, surgical navigation, and disease monitoring. Both CNN-based and Transformer-based methods have achieved remarkable success in medical image segmentation tasks. However, CNN-based methods struggle to effectively capture global contextual information due to the inherent limitations of convolution operations. Meanwhile, Transformer-based methods suffer from insufficient local feature modeling and face challenges related to the high computational complexity caused by the self-attention mechanism. To address these limitations, we propose a novel hybrid CNN-Transformer architecture, named MSLAU-Net, which integrates the strengths of both paradigms. The proposed MSLAU-Net incorporates two key ideas. First, it introduces Multi-Scale Linear Attention, designed to efficiently extract multi-scale features from medical images while modeling long-range dependencies with low computational complexity. Second, it adopts a top-down feature aggregation mechanism, which performs multi-level feature aggregation and restores spatial resolution using a lightweight structure. Extensive experiments conducted on benchmark datasets covering three imaging modalities demonstrate that the proposed MSLAU-Net outperforms other state-of-the-art methods on nearly all evaluation metrics, validating the superiority, effectiveness, and robustness of our approach.Our code is available at https://github.com/Monsoon49/MSLAU-Net.

Libin Lan, Yanxin Li, Xiaojuan Liu, Juan Zhou, Jianxun Zhang, Nannan Huang, Yudong Zhang• 2025

Related benchmarks

Task	Dataset	Result
Medical Image Segmentation	CVC-ClinicDB	Dice Score93.03	118
Medical Image Segmentation	Synapse	Average DSC83.18	77
Medical Image Segmentation	ACDC	DSC (RV)90.38	15

Showing 3 of 3 rows

Other info

Follow for update

@wizwand_team Discord