
Inverse problems with diffusion models: MAP estimation via mode-seeking loss

About

A pre-trained unconditional diffusion model, combined with posterior sampling or maximum a posteriori (MAP) estimation techniques, can solve arbitrary inverse problems without task-specific training or fine-tuning. However, existing posterior sampling and MAP estimation methods often rely on modeling approximations and can also be computationally demanding. In this work, we propose a new MAP estimation strategy for solving inverse problems with a pre-trained unconditional diffusion model. Specifically, we introduce the variational mode-seeking loss (VML) and show that its minimization at each reverse diffusion step guides the generated sample towards the MAP estimate (modes in practice). VML arises from a novel perspective of minimizing the Kullback-Leibler (KL) divergence between the diffusion posterior $p(\mathbf{x}_0|\mathbf{x}_t)$ and the measurement posterior $p(\mathbf{x}_0|\mathbf{y})$, where $\mathbf{y}$ denotes the measurement. Importantly, for linear inverse problems, VML can be analytically derived without any modeling approximations. Based on further theoretical insights, we propose VML-MAP, an empirically effective algorithm for solving inverse problems via VML minimization, and validate its efficacy in both performance and computational time through extensive experiments on diverse image-restoration tasks across multiple datasets.
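For reference, the quantities named above can be written out using standard definitions. This is only a sketch under stated assumptions, not the paper's derivation: the direction of the divergence (diffusion posterior as the first argument) is assumed here, and the linear measurement model with operator $\mathbf{A}$, noise $\mathbf{n}$, and noise level $\sigma_y$ is assumed notation that does not appear in the abstract.

$$
D_{\mathrm{KL}}\big(p(\mathbf{x}_0|\mathbf{x}_t)\,\|\,p(\mathbf{x}_0|\mathbf{y})\big) = \int p(\mathbf{x}_0|\mathbf{x}_t)\,\log\frac{p(\mathbf{x}_0|\mathbf{x}_t)}{p(\mathbf{x}_0|\mathbf{y})}\,d\mathbf{x}_0
$$

$$
\hat{\mathbf{x}}_0^{\mathrm{MAP}} = \arg\max_{\mathbf{x}_0}\, p(\mathbf{x}_0|\mathbf{y}) = \arg\max_{\mathbf{x}_0}\,\big[\log p(\mathbf{y}|\mathbf{x}_0) + \log p(\mathbf{x}_0)\big]
$$

$$
\mathbf{y} = \mathbf{A}\mathbf{x}_0 + \mathbf{n},\quad \mathbf{n}\sim\mathcal{N}(\mathbf{0},\sigma_y^2\mathbf{I}) \;\;\Longrightarrow\;\; \log p(\mathbf{y}|\mathbf{x}_0) = -\frac{\|\mathbf{y}-\mathbf{A}\mathbf{x}_0\|^2}{2\sigma_y^2} + \text{const}
$$

The second line is Bayes' rule with the $\mathbf{x}_0$-independent term $\log p(\mathbf{y})$ dropped. The third line shows the assumed linear Gaussian case, in which the likelihood term is available in closed form.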

Sai Bharath Chandra Gutha, Ricardo Vinuesa, Hossein Azizpour • 2025

Related benchmarks

| Task                   | Dataset                  | Metric | Result | Rank |
|------------------------|--------------------------|--------|--------|------|
| Gaussian Deblurring    | FFHQ 256x256 (val)       | FID    | 84.88  | 24   |
| Image Inpainting       | FFHQ 256x256 (val)       | FID    | 52.76  | 22   |
| 4x super-resolution    | FFHQ 256x256 (val)       | FID    | 52.2   | 19   |
| Super-Resolution (x4)  | ImageNet 256 x 256 (val) | FID    | 58.6   | 17   |
| Face inpainting (Half) | CelebA-HQ-256 (test)     | LPIPS  | 0.208  | 12   |
| Uniform deblurring     | ImageNet 256x256 (val)   | LPIPS  | 0.367  | 12   |
| Super-Resolution       | ImageNet 256             | PSNR   | 23.63  | 12   |
| Box Inpainting         | ImageNet 256 x 256 (val) | FID    | 75.8   | 11   |
| Inpainting             | ImageNet 256x256 (val)   | LPIPS  | 0.262  | 7    |
| Deblurring             | ImageNet 256             | PSNR   | 20.4   | 7    |

Showing 10 of 26 rows
