VM-UNET-V2 Rethinking Vision Mamba UNet for Medical Image Segmentation

About

In medical image segmentation, models based on both CNNs and Transformers have been thoroughly investigated. However, CNNs have limited capacity for modeling long-range dependencies, which makes it difficult to fully exploit the semantic information within images, while Transformers are hampered by their quadratic computational complexity. Recently, State Space Models (SSMs) such as Mamba have been recognized as a promising alternative: they not only demonstrate superior performance in modeling long-range interactions but also preserve linear computational complexity. Inspired by the Mamba architecture, we propose Vision Mamba-UNetV2 (VM-UNetV2). The Visual State Space (VSS) Block is introduced to capture extensive contextual information, and the Semantics and Detail Infusion (SDI) module is introduced to augment the infusion of low-level and high-level features. We conduct comprehensive experiments on the ISIC17, ISIC18, CVC-300, CVC-ClinicDB, Kvasir, CVC-ColonDB and ETIS-LaribPolypDB public datasets. The results indicate that VM-UNetV2 exhibits competitive performance in medical image segmentation tasks. Our code is available at https://github.com/nobodyplayer1/VM-UNetV2.
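
The linear-complexity claim for SSMs can be illustrated with a toy recurrence. The sketch below is an assumption-laden simplification, not the paper's VSS Block or Mamba's selective scan: it uses a scalar, time-invariant state with hypothetical parameters `a`, `b`, `c`, whereas Mamba makes its parameters input-dependent. The function names are invented for this example. It shows that one pass over the sequence lets every output depend on all earlier inputs.

```python
from typing import List, Sequence


def ssm_scan(x: Sequence[float], a: float, b: float, c: float) -> List[float]:
    """Scalar state space recurrence (illustrative only):

        h[t] = a * h[t-1] + b * x[t]
        y[t] = c * h[t]

    A single loop over the input, so the cost grows linearly with len(x).
    """
    h = 0.0
    ys: List[float] = []
    for x_t in x:
        h = a * h + b * x_t  # state summarizes all earlier inputs
        ys.append(c * h)     # readout from the current state
    return ys


def ssm_quadratic_reference(x: Sequence[float], a: float, b: float, c: float) -> List[float]:
    """Same outputs computed by summing over every earlier position.

    y[t] = sum_{k=0..t} c * a**k * b * x[t-k]
    This all-pairs form costs O(L^2). It is included only to check the scan
    and is not a model of self-attention.
    """
    ys: List[float] = []
    for t in range(len(x)):
        acc = 0.0
        for k in range(t + 1):
            acc += c * (a ** k) * b * x[t - k]
        ys.append(acc)
    return ys
```

For example, `ssm_scan([1.0, 0.0, 0.0], a=0.5, b=1.0, c=2.0)` returns `[2.0, 1.0, 0.5]`: the first input keeps influencing later outputs through the decaying state, without any step revisiting earlier positions.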

Mingya Zhang, Yue Yu, Limei Gu, Tingsheng Lin, Xianping Tao • 2024

Related benchmarks

Task	Dataset	Result
Medical Image Segmentation	ISIC 2018	--	193
Cardiac Segmentation	ACDC (test)	Avg Dice 91.28	162
Skin Lesion Segmentation	ISIC 2018 (test)	Dice Score 89.51	143
Polyp Segmentation	ETIS	Dice Score 72.56	138
Skin Lesion Segmentation	ISIC 2018	Dice Coefficient 89.73	119
Polyp Segmentation	Kvasir-SEG (test)	mIoU 0.853	116
Medical Image Segmentation	ISIC 2017	--	102
Polyp Segmentation	ColonDB	mDice 76.62	96
Skin Lesion Segmentation	PH2	DIC 0.8947	92
Skin Lesion Segmentation	PH2 (test)	DSC 92.07	71

Showing 10 of 25 rows
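
Most results above are Dice or IoU variants. The following is a minimal sketch of how these two overlap metrics are commonly computed for a single pair of binary masks. It assumes flattened 0/1 masks and a hypothetical helper name `dice_and_iou`. The benchmarks' actual evaluation protocols (thresholding, per-image versus per-dataset averaging, percent versus fraction scaling) are not stated on this page and may differ.

```python
from typing import Sequence, Tuple


def dice_and_iou(pred: Sequence[int], target: Sequence[int],
                 eps: float = 1e-7) -> Tuple[float, float]:
    """Return (Dice, IoU) for two flattened binary masks of equal length.

    Dice = 2 * |P ∩ T| / (|P| + |T|)
    IoU  = |P ∩ T| / |P ∪ T|
    `eps` avoids division by zero when both masks are empty.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")

    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    pred_area = sum(1 for p in pred if p)
    target_area = sum(1 for t in target if t)
    union = pred_area + target_area - intersection

    dice = (2 * intersection + eps) / (pred_area + target_area + eps)
    iou = (intersection + eps) / (union + eps)
    return dice, iou
```

The two metrics are monotonically related for a single mask pair (Dice = 2·IoU / (1 + IoU)), so Dice is never lower than IoU on the same prediction.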
