SITCOM: Step-wise Triple-Consistent Diffusion Sampling for Inverse Problems

About

Diffusion models (DMs) are a class of generative models that allow sampling from a distribution learned over a training set. When applied to solving inverse problems, the reverse sampling steps are modified to approximately sample from a measurement-conditioned distribution. However, these modifications may be unsuitable for certain settings (e.g., presence of measurement noise) and non-linear tasks, as they often struggle to correct errors from earlier steps and generally require a large number of optimization and/or sampling steps. To address these challenges, we state three conditions for achieving measurement-consistent diffusion trajectories. Building on these conditions, we propose a new optimization-based sampling method that not only enforces standard data manifold measurement consistency and forward diffusion consistency, as seen in previous studies, but also incorporates our proposed step-wise and network-regularized backward diffusion consistency that maintains a diffusion trajectory by optimizing over the input of the pre-trained model at every sampling step. By enforcing these conditions (implicitly or explicitly), our sampler requires significantly fewer reverse steps. Therefore, we refer to our method as Step-wise Triple-Consistent Sampling (SITCOM). Compared to SOTA baselines, our experiments across several linear and non-linear tasks (with natural and medical images) demonstrate that SITCOM achieves competitive or superior results in terms of standard similarity metrics and run-time.
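A minimal sketch of what one reverse step of this kind might look like, offered as an illustration rather than the authors' implementation. The abstract does not give the exact objective, so the form used here (a measurement-fit term plus a proximity penalty that keeps the optimized network input close to the current iterate) is an assumption. All names (`toy_eps`, `tweedie_x0`, `sitcom_like_step`, `demo_single_step`, `lam`, `lr`) are hypothetical, and the "network" is a closed-form toy stand-in, not a pre-trained diffusion model. Loosely, the data term plays the role of measurement consistency, the estimate produced from an input near the current iterate plays the role of backward consistency, and the re-noising at the end plays the role of forward consistency.

```python
import numpy as np

def toy_eps(v, abar_t):
    """Hypothetical stand-in for a pre-trained noise predictor.
    This closed form is the exact predictor only when x0 ~ N(0, I)."""
    return np.sqrt(1.0 - abar_t) * v

def tweedie_x0(v, abar_t):
    """Clean-signal estimate obtained by passing the input v through the network."""
    return (v - np.sqrt(1.0 - abar_t) * toy_eps(v, abar_t)) / np.sqrt(abar_t)

def sitcom_like_step(x_t, y, A, abar_t, abar_prev, lam, lr, n_iters, rng):
    """One assumed step: optimize the network input v, then re-noise the estimate.

    Assumed objective: ||A f(v) - y||^2 + lam * ||x_t - v||^2, with f = tweedie_x0.
    """
    v = x_t.copy()
    s = np.sqrt(abar_t)  # for the toy network only, tweedie_x0(v) == s * v
    for _ in range(n_iters):
        residual = A @ tweedie_x0(v, abar_t) - y
        # Analytic gradient, valid for the linear toy network above;
        # a real network would need automatic differentiation instead.
        grad = 2.0 * s * (A.T @ residual) + 2.0 * lam * (v - x_t)
        v -= lr * grad
    x0_hat = tweedie_x0(v, abar_t)
    # Map the estimate back to noise level t-1 with the forward diffusion kernel.
    noise = rng.standard_normal(x_t.shape)
    x_prev = np.sqrt(abar_prev) * x0_hat + np.sqrt(1.0 - abar_prev) * noise
    return x_prev, x0_hat

def demo_single_step(seed=0):
    """Run one step on a random linear problem; return residuals before and after."""
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((8, 16)) / 4.0   # hypothetical linear forward operator
    y = A @ rng.standard_normal(16)          # noiseless measurements
    x_t = rng.standard_normal(16)            # current noisy iterate
    abar_t, abar_prev, lam = 0.5, 0.6, 0.1
    # Step size chosen from the spectral norm so gradient descent stays stable.
    lr = 0.5 / (np.linalg.norm(A, 2) ** 2 + lam)
    before = np.linalg.norm(A @ tweedie_x0(x_t, abar_t) - y)
    x_prev, x0_hat = sitcom_like_step(x_t, y, A, abar_t, abar_prev, lam, lr, 30, rng)
    after = np.linalg.norm(A @ x0_hat - y)
    return before, after, x_prev
```

Calling `demo_single_step()` returns the measurement residual of the clean-signal estimate before and after the inner optimization, along with the next iterate. In this toy setting the inner loop can only shrink the residual, while the penalty weight `lam` controls how far the optimized input is allowed to drift from `x_t`.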

Ismail Alkhouri, Shijun Liang, Cheng-Han Huang, Jimmy Dai, Qing Qu, Saiprasad Ravishankar, Rongrong Wang • 2024

Related benchmarks

Task	Dataset	Result
Image Reconstruction	ImageNet 256x256	--	202
Super-Resolution (4x)	ImageNet	PSNR 26.519	57
Motion Deblur	FFHQ	PSNR 27.52	56
Super-Resolution	FFHQ 256 x 256	PSNR 27.35	52
Gaussian Deblurring	FFHQ	PSNR 28.775	46
Super-Resolution (4x)	FFHQ	PSNR 29.555	42
Gaussian Deblurring	ImageNet	SSIM 0.702	41
Gaussian deblur	FFHQ 256 x 256	LPIPS 0.266	40
Motion Deblurring	ImageNet	SSIM 0.807	36
HDR	FFHQ	PSNR 27.628	35

Showing 10 of 82 rows

...
