Towards Extreme Image Compression with Latent Feature Guidance and Diffusion Prior
About
Image compression at extremely low bitrates (below 0.1 bits per pixel, bpp) is a significant challenge due to substantial information loss. In this work, we propose a novel two-stage extreme image compression framework that exploits the powerful generative capability of pre-trained diffusion models to achieve realistic image reconstruction at extremely low bitrates. In the first stage, we treat the latent representation of images in the diffusion space as guidance, employing a VAE-based compression approach to compress images and initially decode the compressed information into content variables. The second stage leverages pre-trained Stable Diffusion to reconstruct images under the guidance of the content variables. Specifically, we introduce a small control module to inject content information while keeping the Stable Diffusion model fixed to maintain its generative capability. Furthermore, we design a space alignment loss to force the content variables to align with the diffusion space and provide the necessary constraints for optimization. Extensive experiments demonstrate that our method significantly outperforms state-of-the-art approaches in terms of visual performance at extremely low bitrates. The source code and trained models are available at https://github.com/huai-chang/DiffEIC.
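To give a sense of scale for the 0.1 bpp threshold mentioned above, the following minimal sketch converts between compressed size and bits per pixel. The function names `bits_per_pixel` and `max_bytes_for_bpp` are hypothetical helpers written for illustration; they are not part of the DiffEIC code base.

```python
import math


def bits_per_pixel(num_bytes: int, width: int, height: int) -> float:
    """Return the bitrate in bits per pixel for a compressed image.

    num_bytes is the size of the compressed bitstream; width and height
    are the dimensions of the original image in pixels.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    return num_bytes * 8 / (width * height)


def max_bytes_for_bpp(target_bpp: float, width: int, height: int) -> int:
    """Return the largest whole number of bytes that stays at or below target_bpp."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    return math.floor(target_bpp * width * height / 8)


if __name__ == "__main__":
    # Byte budget for a 512x512 image at the 0.1 bpp threshold.
    budget = max_bytes_for_bpp(0.1, 512, 512)
    print(budget, bits_per_pixel(budget, 512, 512))
```

Under these assumptions, a 512x512 image has to fit in roughly 3.2 KB to stay below 0.1 bpp, which illustrates why so much information is lost at this operating point.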
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Image Compression | DIV2K 512 | BD-PSNR | 13.98 | 90 |
| Image Compression | Kodak24 512 | PSNR | 23.63 | 76 |
| Image Compression | CLIC2020 512x512 (test) | BD-PSNR | 1.35 | 66 |
| Image Compression | Tecnick | BD-Rate | 286.9 | 53 |
| Image Compression | Kodak (test) | BD-Rate (LPIPS) | -37.71 | 35 |
| Image Compression | CLIC 2020 | BD-rate (DISTS) | -36.04 | 34 |
| Image Compression | Kodak | BD-Rate (DISTS) | -33.79 | 25 |
| Image Compression | Tecnick (test) | BD-rate (LPIPS) | -9.96 | 21 |
| Image Compression | Kodak | Encoding Time (s) | 0.128 | 20 |
| Image Compression | DIV2K (test) | BD-Rate (LPIPS) | -15.76 | 20 |