# Manifold Preserving Guided Diffusion

## About
Despite the recent advancements, conditional image generation still faces challenges of cost, generalizability, and the need for task-specific training. In this paper, we propose Manifold Preserving Guided Diffusion (MPGD), a training-free conditional generation framework that leverages pretrained diffusion models and off-the-shelf neural networks with minimal additional inference cost for a broad range of tasks. Specifically, we leverage the manifold hypothesis to refine the guided diffusion steps and introduce a shortcut algorithm in the process. We then propose two methods for on-manifold training-free guidance using pre-trained autoencoders and demonstrate that our shortcut inherently preserves the manifolds when applied to latent diffusion models. Our experiments show that MPGD is efficient and effective for solving a variety of conditional generation applications in low-compute settings, and can consistently offer up to 3.8x speed-ups with the same number of diffusion steps while maintaining high sample quality compared to the baselines.
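The core idea described above — estimating the clean sample during a diffusion step, applying a training-free guidance gradient to it, and then projecting the result back onto the data manifold with a pretrained autoencoder — can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation: `eps_model`, `encode`, `decode`, and `guidance_grad` are hypothetical stand-ins for the pretrained denoiser, the off-the-shelf autoencoder, and the gradient of an arbitrary guidance loss.

```python
import numpy as np

def ddim_step_with_mpgd_guidance(x_t, t, alpha_t, alpha_prev,
                                 eps_model, encode, decode,
                                 guidance_grad, lr=0.1):
    """One deterministic DDIM step with MPGD-style guidance.

    Sketch under assumptions: `eps_model(x, t)` is the pretrained noise
    predictor, `encode`/`decode` form a pretrained autoencoder, and
    `guidance_grad(x0)` is the gradient of an off-the-shelf guidance loss
    with respect to the clean-sample estimate.
    """
    eps = eps_model(x_t, t)
    # Tweedie-style clean-sample estimate from the noisy sample x_t
    x0 = (x_t - np.sqrt(1.0 - alpha_t) * eps) / np.sqrt(alpha_t)
    # Training-free guidance: gradient step on the clean estimate
    x0 = x0 - lr * guidance_grad(x0)
    # Manifold projection: autoencoder round-trip keeps x0 on-manifold
    x0 = decode(encode(x0))
    # Deterministic DDIM transition to the previous timestep
    return np.sqrt(alpha_prev) * x0 + np.sqrt(1.0 - alpha_prev) * eps
```

For latent diffusion models the paper notes the shortcut preserves the manifold inherently, so in that setting the explicit autoencoder round-trip above would be unnecessary.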
## Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Class-conditional Image Generation | ImageNet | FID | 239 | 158 |
| Conditional Image Generation | CIFAR-10 | FID | 88 | 77 |
| 4x Super-Resolution | FFHQ 256x256 | PSNR | 24.01 | 33 |
| 4x Super-Resolution | ImageNet | PSNR | 23.93 | 30 |
| Inpainting (box) | ImageNet | PSNR | 22.76 | 26 |
| Gaussian Deblurring | FFHQ 256x256 | PSNR | 24.42 | 25 |
| 4x Super-Resolution | Cats | LPIPS | 0.09 | 14 |
| Gaussian Deblurring 3 | Cats | LPIPS | 0.14 | 14 |
| Gaussian Deblurring 12 | Cats | LPIPS | 0.32 | 14 |
| Gaussian Deblurring | ImageNet Gaussian Blur sigma=3 | LPIPS | 0.23 | 14 |