Stage-wise Distortion-Perception Traversal in Zero-shot Inverse Problems with Diffusion Models
About
The distortion-perception (D-P) tradeoff is a fundamental phenomenon of Bayesian inverse problems, which characterizes the inherent tension between distortion performance and perceptual quality. Enabling flexible traversal of the D-P tradeoff at inference time is crucial for practical applications. Despite the recent success of diffusion models in zero-shot inverse problem solving, efficient and principled strategies for D-P traversal in diffusion-based inverse algorithms remain inadequately characterized. In this paper, we propose a stage-wise framework for realizing D-P traversal using a single diffusion model in zero-shot inverse problems. Our proposed method, termed MAP-RPS, starts with an MAP estimation stage that approximates the MMSE solution and provides a low-distortion initialization, followed by a re-noised posterior sampling stage that progressively improves perceptual quality. We provide theoretical analyses for both stages, establishing the validity and effectiveness of the proposed design. Furthermore, we extend MAP-RPS to the latent space, yielding LMAP-RPS, which enjoys broader applicability by leveraging large-scale pre-trained latent diffusion backbones. Extensive experiments demonstrate that MAP-RPS and LMAP-RPS enable more effective D-P traversal on various tasks, while also exhibiting strong performance as efficient solvers for real-world inverse problems.
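As a rough illustration of the D-P tradeoff itself, and not of MAP-RPS, the sketch below uses a scalar Gaussian toy problem whose posterior is known in closed form. It assumes x ~ N(0, 1) and y = x + sigma * n, and moves between the MMSE estimate (t = 0) and a posterior sample (t = 1) by scaling the injected posterior noise. The function name `dp_toy` and the interpolation parameter `t` are hypothetical and chosen only for this example.

```python
import math
import random


def dp_toy(sigma=1.0, n=100_000, ts=(0.0, 0.5, 1.0), seed=0):
    """Toy D-P traversal for x ~ N(0, 1), y = x + sigma * noise.

    Returns a list of (t, mse, variance_of_estimate) tuples.
    This is an illustrative assumption, not the paper's algorithm.
    """
    rng = random.Random(seed)
    # Closed-form Gaussian posterior: mean = y / (1 + sigma^2),
    # variance = sigma^2 / (1 + sigma^2).
    post_gain = 1.0 / (1.0 + sigma ** 2)
    post_std = math.sqrt(sigma ** 2 / (1.0 + sigma ** 2))

    sq_err = [0.0] * len(ts)
    total = [0.0] * len(ts)
    total_sq = [0.0] * len(ts)
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        y = x + sigma * rng.gauss(0.0, 1.0)
        eps = rng.gauss(0.0, 1.0)
        post_mean = post_gain * y
        for i, t in enumerate(ts):
            # t = 0 gives the MMSE estimate, t = 1 a posterior sample.
            x_hat = post_mean + t * post_std * eps
            sq_err[i] += (x_hat - x) ** 2
            total[i] += x_hat
            total_sq[i] += x_hat ** 2

    results = []
    for i, t in enumerate(ts):
        mean = total[i] / n
        var = total_sq[i] / n - mean ** 2
        results.append((t, sq_err[i] / n, var))
    return results


if __name__ == "__main__":
    for t, mse, var in dp_toy():
        print(f"t={t:.1f}  mse={mse:.3f}  var(x_hat)={var:.3f}")
```

In this toy setting the posterior sample has twice the MSE of the MMSE estimate, but its variance matches the prior variance of 1, whereas the MMSE estimate is over-smoothed. Intermediate values of `t` trade one property for the other.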
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| 2x Compressed Sensing | MS-COCO (test) | Inference Time (s/image) | 68 | 11 |
| 4x super-resolution | MS-COCO (test) | Inference Time (s/image) | 23 | 11 |
| Anisotropic Deblurring | MS-COCO | PSNR | 26.42 | 11 |
| Anisotropic Deblurring | MS-COCO | Runtime (s/img) | 44 | 11 |
| Compressed sensing | MS-COCO | Runtime (s/img) | 68 | 11 |
| Compressed Sensing (2x) | MS-COCO | PSNR | 22.9 | 11 |
| Inpainting | MS-COCO | PSNR | 28.14 | 11 |
| Inpainting | MS-COCO | Runtime (s/img) | 45 | 11 |
| Random Inpainting | MS-COCO (test) | Inference Time (s/image) | 45 | 11 |
| Super-Resolution | MS-COCO | Runtime (s/img) | 23 | 11 |