An Edit Friendly DDPM Noise Space: Inversion and Manipulations

About

Denoising diffusion probabilistic models (DDPMs) employ a sequence of white Gaussian noise samples to generate an image. In analogy with GANs, those noise maps could be considered as the latent code associated with the generated image. However, this native noise space does not possess a convenient structure, and is thus challenging to work with in editing tasks. Here, we propose an alternative latent noise space for DDPM that enables a wide range of editing operations via simple means, and present an inversion method for extracting these edit-friendly noise maps for any given image (real or synthetically generated). As opposed to the native DDPM noise space, the edit-friendly noise maps do not have a standard normal distribution and are not statistically independent across timesteps. However, they allow perfect reconstruction of any desired image, and simple transformations on them translate into meaningful manipulations of the output image (e.g., shifting, color edits). Moreover, in text-conditional models, fixing those noise maps while changing the text prompt modifies semantics while retaining structure. We illustrate how this property enables text-based editing of real images via the diverse DDPM sampling scheme (in contrast to the popular non-diverse DDIM inversion). We also show how it can be used within existing diffusion-based editing methods to improve their quality and diversity. Webpage: https://inbarhub.github.io/DDPM_inversion
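The perfect-reconstruction property described above can be illustrated with a small NumPy sketch. It rests on assumptions the abstract does not spell out: each noisy image is drawn from the input with its own independent noise, each sampling step is then solved for the noise map that links consecutive noisy images, and the step standard deviation is taken as the square root of beta. The names `toy_eps_model`, `step_mean`, `invert`, and `generate` are hypothetical, and the linear "noise predictor" is only a stand-in for a trained network, so this is not the authors' implementation.

```python
import numpy as np

def make_schedule(num_steps=50, beta_start=1e-4, beta_end=0.02):
    # Linear beta schedule (an assumed, commonly used choice).
    betas = np.linspace(beta_start, beta_end, num_steps)
    alphas = 1.0 - betas
    return betas, alphas, np.cumprod(alphas)

def toy_eps_model(x, step):
    # Hypothetical stand-in for a trained noise predictor.
    return 0.1 * x

def step_mean(x, step, sched, eps_model):
    # Mean of one reverse DDPM step from timestep `step` to `step - 1`.
    betas, alphas, alpha_bars = sched
    i = step - 1
    eps = eps_model(x, step)
    return (x - betas[i] / np.sqrt(1.0 - alpha_bars[i]) * eps) / np.sqrt(alphas[i])

def invert(x0, sched, eps_model, rng):
    betas, _, alpha_bars = sched
    num_steps = len(betas)
    # Assumption: every noisy image uses its own independent noise sample.
    xs = [x0]
    for step in range(1, num_steps + 1):
        ab = alpha_bars[step - 1]
        noise = rng.standard_normal(x0.shape)
        xs.append(np.sqrt(ab) * x0 + np.sqrt(1.0 - ab) * noise)
    # Solve each sampling step for the noise map that connects xs[step] to xs[step - 1].
    zs = {}
    for step in range(num_steps, 0, -1):
        sigma = np.sqrt(betas[step - 1])
        zs[step] = (xs[step - 1] - step_mean(xs[step], step, sched, eps_model)) / sigma
    return xs[num_steps], zs

def generate(x_T, zs, sched, eps_model):
    # Ordinary stochastic sampling loop, but with the extracted noise maps held fixed.
    betas = sched[0]
    x = x_T
    for step in range(len(betas), 0, -1):
        x = step_mean(x, step, sched, eps_model) + np.sqrt(betas[step - 1]) * zs[step]
    return x

rng = np.random.default_rng(0)
x0 = rng.uniform(-1.0, 1.0, size=(8, 8))  # toy "image"
sched = make_schedule()
x_T, zs = invert(x0, sched, toy_eps_model, rng)
recon = generate(x_T, zs, sched, toy_eps_model)
print("reconstructed:", np.allclose(recon, x0))
```

In this sketch the noise maps are computed from the image rather than sampled, which is why they need not be standard normal or independent across timesteps. In a text-conditional model, the editing use described above would correspond to calling the sampling loop with the same `x_T` and `zs` but a noise predictor conditioned on a different prompt.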

Inbar Huberman-Spiegelglas, Vladimir Kulikov, Tomer Michaeli • 2023

Related benchmarks

Task	Dataset	Result
Image Editing	PIE-Bench	PSNR 27.49	257
Image Editing	PIE-Bench (test)	--	55
Text-Guided Image Editing	PIE-Bench	CLIP Similarity (Whole) 21.51	40
Image Editing	ImageNet-R TI2I	CLIP Score 27.01	24
Image Editing	PIE-Bench 1.0 (test)	PSNR 24.55	22
Image-to-Image Translation (Appearance Consistency)	LAION Mini	Structure Similarity 0.94	20
Image-to-Image Translation (Appearance Divergence)	LAION Mini	Structure Similarity 94	20
Layout-free HOI editing	IEBench	Editability-Identity 0.438	14
Image Tone Adjustment	AnyEdit PIE-Bench	SSIM 0.702	12
Text-guided Image-to-Image Translation	ImageNet-R TI2I modified	CLIP Similarity 32	10

Showing 10 of 20 rows
