Adaptively Distilled ControlNet: Accelerated Training and Superior Sampling for Medical Image Synthesis
About
Medical image annotation is constrained by privacy concerns and labor-intensive labeling, significantly limiting the performance and generalization of segmentation models. While mask-controllable diffusion models excel in synthesis, they struggle with precise lesion-mask alignment. We propose **Adaptively Distilled ControlNet**, a task-agnostic framework that accelerates training and optimization through dual-model distillation. Specifically, during training, a teacher model, conditioned on mask-image pairs, regularizes a mask-only student model via predicted noise alignment in parameter space, further enhanced by adaptive regularization based on lesion-background ratios. During sampling, only the student model is used, enabling privacy-preserving medical image generation. Comprehensive evaluations on two distinct medical datasets demonstrate state-of-the-art performance: TransUNet improves mDice/mIoU by 2.4%/4.2% on KiTS19, while SANet achieves 2.6%/3.5% gains on Polyps, highlighting its effectiveness and superiority. Code is available at GitHub.
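The training objective described above can be sketched as follows. This is a minimal illustrative sketch, not the authors' released implementation: the function names (`adaptive_weight`, `distillation_loss`), the epsilon constant, and the exact form of the lesion-ratio weighting are assumptions; the sketch only shows the idea of combining a standard denoising loss with a teacher-student noise-alignment term whose weight grows as the lesion becomes smaller relative to the background.

```python
import numpy as np

def adaptive_weight(mask, base_weight=1.0, eps=1e-6):
    """Hypothetical adaptive regularization weight.

    Scales the distillation term by the inverse lesion-to-background
    ratio, so images with small (rare) lesions receive stronger
    teacher regularization. `mask` is a binary {0, 1} lesion mask.
    """
    lesion_ratio = mask.mean()  # fraction of lesion pixels
    return base_weight / (lesion_ratio + eps)

def distillation_loss(student_noise, teacher_noise, true_noise, mask, lam=0.1):
    """Combined training loss for the mask-only student (sketch).

    - Denoising term: match the student's predicted noise to the
      ground-truth diffusion noise (standard diffusion objective).
    - Distillation term: align the student's predicted noise with the
      teacher's (conditioned on mask-image pairs), weighted adaptively.
    """
    denoise = np.mean((student_noise - true_noise) ** 2)
    distill = np.mean((student_noise - teacher_noise) ** 2)
    return denoise + lam * adaptive_weight(mask) * distill
```

At sampling time only the student would be called, since it conditions on the mask alone; the teacher (and the private training images it saw) is discarded, which is what enables the privacy-preserving generation claimed above.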
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Polyp Segmentation | Kvasir | Dice Score | 92 | 128 |
| Polyp Segmentation | ETIS | Dice Score | 80.8 | 108 |
| Polyp Segmentation | ColonDB | mDice | 82 | 74 |
| Polyp Segmentation | EndoScene | mDice | 90.3 | 61 |
| Polyp Segmentation | ClinicDB | mDice | 0.93 | 50 |
| Polyp Segmentation | Overall Combined Datasets | mDice | 0.844 | 21 |
| Tumor Segmentation | KiTS19 (test) | mDice | 97.9 | 10 |
| Medical Image Synthesis | Polyps | FID | 66.587 | 5 |
| Medical Image Synthesis | KiTS19 | FID | 70.786 | 3 |