
Factorized Diffusion Architectures for Unsupervised Image Generation and Segmentation

About

We develop a neural network architecture which, trained in an unsupervised manner as a denoising diffusion model, simultaneously learns to both generate and segment images. Learning is driven entirely by the denoising diffusion objective, without any annotation or prior knowledge about regions during training. A computational bottleneck, built into the neural architecture, encourages the denoising network to partition an input into regions, denoise them in parallel, and combine the results. Our trained model generates both synthetic images and, by simple examination of its internal predicted partitions, a semantic segmentation of those images. Without any finetuning, we directly apply our unsupervised model to the downstream task of segmenting real images via noising and subsequently denoising them. Experiments demonstrate that our model achieves accurate unsupervised image segmentation and high-quality synthetic image generation across multiple datasets.
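The partition, parallel-denoise, and combine step described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that the partition is a per-pixel softmax over K region logits and that the combined output is the mask-weighted sum of K per-region predictions, neither of which is stated in the abstract. The names `combine_regions`, `mask_logits`, and `region_outputs` are hypothetical, and images are flattened to plain lists of pixel values for brevity.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def combine_regions(mask_logits, region_outputs):
    """Combine per-region predictions using soft region masks.

    mask_logits[k][i]    : logit that pixel i belongs to region k (assumed form)
    region_outputs[k][i] : prediction of the k-th parallel branch at pixel i
    Returns (combined, labels): the mask-weighted sum per pixel, and the
    argmax region index per pixel, read off as a segmentation.
    """
    num_regions = len(mask_logits)
    num_pixels = len(mask_logits[0])
    combined, labels = [], []
    for i in range(num_pixels):
        # Soft assignment of pixel i to the K regions.
        weights = softmax([mask_logits[k][i] for k in range(num_regions)])
        # Each branch contributes in proportion to its mask weight.
        combined.append(
            sum(w * region_outputs[k][i] for k, w in enumerate(weights))
        )
        # Examining the partition directly gives a label per pixel.
        labels.append(max(range(num_regions), key=lambda k: weights[k]))
    return combined, labels


# Two regions, two pixels: pixel 0 strongly favors region 0, pixel 1 region 1.
mask_logits = [[10.0, -10.0], [-10.0, 10.0]]
region_outputs = [[1.0, 1.0], [5.0, 5.0]]
combined, labels = combine_regions(mask_logits, region_outputs)
```

In this toy case each pixel's output is dominated by the branch whose mask claims it, and `labels` is the segmentation obtained by inspecting the masks, which mirrors how the abstract describes reading a segmentation from the model's internal partitions.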

Xin Yuan, Michael Maire • 2023

Related benchmarks

Task                   Dataset               Result          Rank
Image Generation       FFHQ                  FID 10.79       91
Image Generation       ImageNet              FID 6.54        68
Semantic segmentation  CelebAMask-HQ (test)  --              9
Image Generation       Flower                FID 11.5        7
Image Segmentation     CUB                   mIoU 56.1       6
Image Generation       CUB                   FID 10.28       6
Image Segmentation     Flower                Accuracy 90.1   5
Mask Generation        Flower                Accuracy 92.7   4
Mask Generation        CUB                   Accuracy 91.4   4
Mask Generation        FFHQ                  Accuracy 90.7   4
