Masked Diffusion as Self-supervised Representation Learner
About
Denoising diffusion probabilistic models have recently demonstrated state-of-the-art generative performance and have also been used as strong pixel-level representation learners. This paper decomposes the interrelation between the generative capability and the representation learning ability inherent in diffusion models. We present the masked diffusion model (MDM), a scalable self-supervised representation learner for semantic segmentation, which replaces the conventional additive Gaussian noise of diffusion with a masking mechanism. Our proposed approach convincingly surpasses prior benchmarks, demonstrating remarkable advancements in both medical and natural image semantic segmentation tasks, particularly in few-shot scenarios.
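The core idea, swapping Gaussian-noise corruption for masking, can be illustrated with a minimal sketch. The function below is hypothetical (the paper's actual mask schedule, patch layout, and reconstruction target may differ); it simply zeroes out a random fraction of square patches, which is the kind of masking-based forward process the abstract describes.

```python
import numpy as np

def mask_image(image, mask_ratio, patch_size, seed=None):
    """Corrupt an image by zeroing out a random fraction of square patches.

    A hypothetical stand-in for MDM's masking mechanism: instead of adding
    Gaussian noise at each step, patches are hidden and the network is
    trained to reconstruct them.
    """
    rng = np.random.default_rng(seed)
    h, w = image.shape[:2]
    gh, gw = h // patch_size, w // patch_size           # patch grid size
    n_patches = gh * gw
    n_mask = int(round(mask_ratio * n_patches))
    chosen = rng.choice(n_patches, size=n_mask, replace=False)

    mask = np.zeros(n_patches, dtype=bool)
    mask[chosen] = True

    masked = image.copy()
    for p in np.flatnonzero(mask):
        r, c = divmod(p, gw)
        masked[r * patch_size:(r + 1) * patch_size,
               c * patch_size:(c + 1) * patch_size] = 0.0
    return masked, mask.reshape(gh, gw)

# Example: mask half of the 8x8 patches of a 64x64 image.
img = np.ones((64, 64), dtype=np.float32)
corrupted, mask = mask_image(img, mask_ratio=0.5, patch_size=8, seed=0)
```

A self-supervised objective would then train a denoising network to predict the original pixels inside the masked patches, analogous to how a diffusion model predicts the added noise.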
Zixuan Pan, Jianxu Chen, Yiyu Shi • 2023
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Cell Segmentation | MoNuSeg | AJI (Object) | 68.25 | 28 |
| Semantic Segmentation | GLAS | Dice | 0.9195 | 28 |
| Semantic Segmentation | FFHQ-34 (test) | mIoU | 0.6034 | 7 |
| Semantic Segmentation | CelebA-19 (test) | mIoU | 59.57 | 5 |