Self-Guided Diffusion Models

About

Diffusion models have demonstrated remarkable progress in image generation quality, especially when guidance is used to control the generative process. However, guidance requires a large number of image-annotation pairs for training and is thus dependent on their availability, correctness and unbiasedness. In this paper, we eliminate the need for such annotation by instead leveraging the flexibility of self-supervision signals to design a framework for self-guided diffusion models. By leveraging a feature extraction function and a self-annotation function, our method provides guidance signals at various image granularities: from the level of holistic images to object boxes and even segmentation masks. Our experiments on single-label and multi-label image datasets demonstrate that self-labeled guidance always outperforms diffusion models without guidance and may even surpass guidance based on ground-truth labels, especially on unbalanced data. When equipped with self-supervised box or mask proposals, our method further generates visually diverse yet semantically consistent images, without the need for any class, box, or segment label annotation. Self-guided diffusion is simple, flexible and expected to profit from deployment at scale. Source code will be at: https://taohu.me/sgdm/
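
The two-function pipeline described above (feature extraction, then self-annotation, then guidance) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the authors' implementation: the features are taken as already extracted, `self_annotate` is a hypothetical self-annotation function that uses plain k-means to produce pseudo-labels, and `guided_noise` applies the common classifier-free guidance combination with a guidance weight `w`, which may differ from the exact formulation used in the paper.

```python
import random
from typing import List, Sequence

Vector = Sequence[float]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(point: Vector, centroids: List[List[float]]) -> int:
    """Index of the centroid closest to `point`."""
    return min(range(len(centroids)),
               key=lambda k: squared_distance(point, centroids[k]))


def self_annotate(features: List[Vector], num_clusters: int,
                  iters: int = 20, seed: int = 0) -> List[int]:
    """Hypothetical self-annotation function: k-means pseudo-labels.

    Assumes `features` were produced by some self-supervised feature
    extractor; no human annotation is involved.
    """
    rng = random.Random(seed)
    # Initialise centroids from randomly chosen feature vectors.
    centroids = [list(f) for f in rng.sample(features, num_clusters)]
    labels = [0] * len(features)
    for _ in range(iters):
        # Assignment step: each feature takes the label of its nearest centroid.
        labels = [nearest(f, centroids) for f in features]
        # Update step: move each centroid to the mean of its members.
        for k in range(num_clusters):
            members = [f for f, lab in zip(features, labels) if lab == k]
            if members:
                centroids[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def guided_noise(eps_cond: Vector, eps_uncond: Vector, w: float) -> List[float]:
    """Classifier-free guidance combination (assumed convention):
    (1 + w) * eps_cond - w * eps_uncond, so w = 0 gives the purely
    conditional prediction.
    """
    return [(1 + w) * c - w * u for c, u in zip(eps_cond, eps_uncond)]


# Toy usage: two well-separated groups of 2-D "features".
toy_features = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
                [10.0, 10.0], [10.1, 10.0], [10.0, 10.1]]
pseudo_labels = self_annotate(toy_features, num_clusters=2)
```

In a full system the pseudo-labels would replace ground-truth class labels as the conditioning signal of the denoising network, and `eps_cond` and `eps_uncond` would be that network's noise predictions with and without the pseudo-label.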

Vincent Tao Hu, David W. Zhang, Yuki M. Asano, Gertjan J. Burghouts, Cees G. M. Snoek • 2022

Related benchmarks

Task	Dataset	Result
Image Generation	ImageNet 64x64	FID 12.1	114
Image Generation	LSUN Church 256x256 (test)	FID 15.2	61
Image Generation	ImageNet 64x64 (val)	FID 12.1	48
Image Generation	ImageNet 32x32	FID 7.3	11
Image Generation	Pascal VOC (train)	FID 17.1	8
Image Generation	COCO-Stuff (train)	FID 12.5	4
Image Generation	COCO-Stuff (val)	FID 17.7	4
Image Generation	COCO 20K (train val)	FID 16	4
Image Generation	ImageNet 32x32 (val)	FID 7.3	3
Image Generation	ImageNet-100 256x256 (val)	FID 16.1	3

