Regularization by Texts for Latent Diffusion Inverse Solvers
About
The recent development of diffusion models has led to significant progress in solving inverse problems by leveraging these models as powerful generative priors. However, challenges persist due to the ill-posed nature of such problems, often arising from ambiguities in measurements or intrinsic system symmetries. To address this, here we introduce a novel latent diffusion inverse solver, regularization by text (TReg), inspired by the human ability to resolve visual ambiguities through perceptual biases. TReg integrates textual descriptions of preconceptions about the solution during reverse diffusion sampling, dynamically reinforcing these descriptions through null-text optimization, which we refer to as adaptive negation. Our comprehensive experimental results demonstrate that TReg effectively mitigates ambiguity in inverse problems, improving both accuracy and efficiency.
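The measurement ambiguity mentioned above can be illustrated with a toy example. This is only a sketch and not part of TReg: the forward operator `downsample` (2x average pooling on a 1-D signal) and the two signals are hypothetical, chosen to show how distinct inputs can yield identical measurements.

```python
# Toy illustration of an ill-posed inverse problem (assumption: not TReg code).
# The forward operator is a hypothetical 2x average pooling on a 1-D signal.

def downsample(x, factor=2):
    """Average each consecutive block of `factor` samples."""
    if len(x) % factor != 0:
        raise ValueError("signal length must be a multiple of factor")
    return [sum(x[i:i + factor]) / factor for i in range(0, len(x), factor)]

# Two different candidate solutions.
smooth = [1.0, 1.0, 3.0, 3.0]
textured = [0.0, 2.0, 2.0, 4.0]

# Both map to the same measurement, so the measurement alone
# cannot tell them apart.
y_smooth = downsample(smooth)
y_textured = downsample(textured)

print(y_smooth == y_textured)  # True
```

Since both signals are consistent with the measurement, some extra prior is needed to choose between them. In TReg that role is played by a textual description of the expected solution.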
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Super-Resolution | FFHQ 1k | FID | 49.15 | 23 |
| Image Denoising | BSD400 (test) | FID | 94.11 | 16 |
| Image Deblurring | FFHQ 1k | FID | 52.07 | 16 |
| Image Colorization | DIV2K | FID | 183.3 | 16 |
| Motion Deblurring | FFHQ 1k | PSNR | 26.36 | 13 |
| Image Restoration | FFHQ 512 (test) | VRAM (GB) | 6.4 | 7 |
| Gaussian Deblurring | ImageNet 512 (val) | FID | 56.54 | 7 |
| Motion Deblurring | FFHQ 512 (val) | FID | 44.97 | 7 |
| Motion Deblurring | ImageNet 512 (val) | FID | 78.75 | 7 |
| Gaussian Deblurring | FFHQ 512 (val) | FID | 48.73 | 7 |