Conditional diffusion model with spatial attention and latent embedding for medical image segmentation

About

Diffusion models have been used extensively for high-quality image and video generation tasks. In this paper, we propose a novel conditional diffusion model with spatial attention and latent embedding (cDAL) for medical image segmentation. In cDAL, a convolutional neural network (CNN) based discriminator is used at every time-step of the diffusion process to distinguish between the generated labels and the real ones. A spatial attention map is computed from the features learned by the discriminator to help cDAL generate more accurate segmentations of discriminative regions in an input image. Additionally, we incorporate a random latent embedding into each layer of our model to significantly reduce the number of training and sampling time-steps, making it much faster than other diffusion models for image segmentation. We applied cDAL to three publicly available medical image segmentation datasets (MoNuSeg, Chest X-ray and Hippocampus) and observed significant qualitative and quantitative improvements, with higher Dice scores and mIoU than the state-of-the-art algorithms. The source code is publicly available at https://github.com/Hejrati/cDAL/.

Behzad Hejrati, Soumyanil Banerjee, Carri Glide-Hurst, Ming Dong • 2025

Related benchmarks

Task	Dataset	Result
Lesion Segmentation	LI	mDice 74.6	18
Lesion Segmentation	CP	mDice 85.7	18
Lesion Segmentation	BT	mDice 80	18
Lesion Segmentation	BL	mDice 76.9	18
Lesion Segmentation	WA	mDice 78.3	18
Lesion Segmentation	ADC	mDice 67.4	18
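
For readers unfamiliar with the metrics, the sketch below shows how Dice and IoU are commonly computed for binary segmentation masks and then averaged over a set of images. It assumes that "mDice" in the table means the mean Dice score, which the page does not state. The function names (dice_score, iou_score, mean_score) and the convention of scoring two empty masks as 1.0 are illustrative choices, not taken from the cDAL source code.

```python
def _flatten(mask):
    # Turn a 2D mask (list of rows of 0/1 values) into a flat list.
    return [v for row in mask for v in row]


def dice_score(pred, target):
    """Dice = 2 * |A ∩ B| / (|A| + |B|) for two binary masks of equal shape."""
    p, t = _flatten(pred), _flatten(target)
    intersection = sum(1 for a, b in zip(p, t) if a and b)
    total = sum(1 for a in p if a) + sum(1 for b in t if b)
    # Assumption: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


def iou_score(pred, target):
    """IoU = |A ∩ B| / |A ∪ B| for two binary masks of equal shape."""
    p, t = _flatten(pred), _flatten(target)
    intersection = sum(1 for a, b in zip(p, t) if a and b)
    union = sum(1 for a, b in zip(p, t) if a or b)
    # Assumption: two empty masks count as a perfect match.
    return 1.0 if union == 0 else intersection / union


def mean_score(metric, preds, targets):
    """Average a per-image metric over paired predictions and ground truths."""
    scores = [metric(p, t) for p, t in zip(preds, targets)]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    pred = [[1, 1], [0, 0]]
    target = [[1, 0], [0, 0]]
    print(dice_score(pred, target))  # 2 * 1 / (2 + 1)
    print(iou_score(pred, target))   # 1 / 2
```

Dice counts the overlap twice in the numerator, so for the same pair of masks it is never lower than IoU. Scores on a 0 to 100 scale, as in the table, would correspond to these values multiplied by 100.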
