Adaptively Distilled ControlNet: Accelerated Training and Superior Sampling for Medical Image Synthesis
About
Medical image annotation is constrained by privacy concerns and labor-intensive labeling, significantly limiting the performance and generalization of segmentation models. While mask-controllable diffusion models excel in synthesis, they struggle with precise lesion-mask alignment. We propose **Adaptively Distilled ControlNet**, a task-agnostic framework that accelerates training and optimization through dual-model distillation. Specifically, during training, a teacher model, conditioned on mask-image pairs, regularizes a mask-only student model via predicted noise alignment in parameter space, further enhanced by adaptive regularization based on lesion-background ratios. During sampling, only the student model is used, enabling privacy-preserving medical image generation. Comprehensive evaluations on two distinct medical datasets demonstrate state-of-the-art performance: TransUNet improves mDice/mIoU by 2.4%/4.2% on KiTS19, while SANet achieves 2.6%/3.5% gains on Polyps, highlighting its effectiveness and superiority. Code is available at GitHub.
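The training objective described above can be sketched as follows. This is a minimal illustrative sketch, not the authors' released implementation: the function names (`adaptive_weight`, `distillation_loss`), the epsilon constant, and the exact form of the lesion-ratio weighting are assumptions; the sketch only shows the idea of combining a standard denoising loss with a teacher-student noise-alignment term whose weight grows as the lesion becomes smaller relative to the background.

```python
import numpy as np

def adaptive_weight(mask, base_weight=1.0, eps=1e-6):
    """Hypothetical adaptive regularization weight.

    Scales the distillation term by the inverse lesion-to-background
    ratio, so images with small (rare) lesions receive stronger
    teacher regularization. `mask` is a binary {0, 1} lesion mask.
    """
    lesion_ratio = mask.mean()  # fraction of lesion pixels
    return base_weight / (lesion_ratio + eps)

def distillation_loss(student_noise, teacher_noise, true_noise, mask, lam=0.1):
    """Combined training loss for the mask-only student (sketch).

    - Denoising term: match the student's predicted noise to the
      ground-truth diffusion noise (standard diffusion objective).
    - Distillation term: align the student's predicted noise with the
      teacher's (conditioned on mask-image pairs), weighted adaptively.
    """
    denoise = np.mean((student_noise - true_noise) ** 2)
    distill = np.mean((student_noise - teacher_noise) ** 2)
    return denoise + lam * adaptive_weight(mask) * distill
```

At sampling time only the student would be called, since it conditions on the mask alone; the teacher (and the private training images it saw) is discarded, which is what enables the privacy-preserving generation claimed above.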
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Polyp Segmentation | Kvasir | Dice Score | 92 | 128 |
| Polyp Segmentation | ETIS | Dice Score | 80.8 | 108 |
| Polyp Segmentation | ColonDB | mDice | 82 | 74 |
| Polyp Segmentation | EndoScene | mDice | 90.3 | 61 |
| Polyp Segmentation | ClinicDB | mDice | 0.93 | 50 |
| Polyp Segmentation | Overall Combined Datasets | mDice | 0.844 | 21 |
| Tumor Segmentation | KiTS19 (test) | mDice | 97.9 | 10 |
| Medical Image Synthesis | Polyps | FID | 66.587 | 5 |
| Medical Image Synthesis | KiTS19 | FID | 70.786 | 3 |