StableCodec: Taming One-Step Diffusion for Extreme Image Compression

About

Diffusion-based image compression has shown remarkable potential for achieving ultra-low bitrate coding (less than 0.05 bits per pixel) with high realism, by leveraging the generative priors of large pre-trained text-to-image diffusion models. However, current approaches require a large number of denoising steps at the decoder to generate realistic results under extreme bitrate constraints, limiting their application in real-time compression scenarios. Additionally, these methods often sacrifice reconstruction fidelity, as diffusion models typically fail to guarantee pixel-level consistency. To address these challenges, we introduce StableCodec, which enables one-step diffusion for high-fidelity and high-realism extreme image compression with improved coding efficiency. To achieve ultra-low bitrates, we first develop an efficient Deep Compression Latent Codec to transmit a noisy latent representation for a single-step denoising process. We then propose a Dual-Branch Coding Structure, consisting of an auxiliary encoder and decoder pair, to enhance reconstruction fidelity. Furthermore, we adopt end-to-end optimization with joint bitrate and pixel-level constraints. Extensive experiments on the CLIC 2020, DIV2K, and Kodak datasets demonstrate that StableCodec outperforms existing methods in terms of FID, KID, and DISTS by a significant margin, even at bitrates as low as 0.005 bits per pixel, while maintaining strong fidelity. Additionally, StableCodec achieves inference speeds comparable to mainstream transform coding schemes. All source code is available at https://github.com/LuizScarlet/StableCodec.

Tianyu Zhang, Xin Luo, Li Li, Dong Liu • 2025
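
To make the bitrates quoted in the abstract concrete, the following minimal sketch converts bits per pixel into a payload size. The helper names `payload_bytes` and `bits_per_pixel` are hypothetical and are not taken from the StableCodec repository; the 512x512 resolution is assumed because it matches the crops named in the benchmarks.

```python
def payload_bytes(width: int, height: int, bpp: float) -> float:
    """Compressed size in bytes of a width x height image coded at `bpp` bits per pixel."""
    return width * height * bpp / 8.0


def bits_per_pixel(num_bytes: float, width: int, height: int) -> float:
    """Inverse conversion: bits per pixel for a bitstream of `num_bytes` bytes."""
    return num_bytes * 8.0 / (width * height)


if __name__ == "__main__":
    # 0.05 bpp is the "ultra-low" threshold and 0.005 bpp the lowest rate
    # mentioned in the abstract; 512x512 is an assumed image size.
    for bpp in (0.05, 0.005):
        size = payload_bytes(512, 512, bpp)
        print(f"{bpp} bpp at 512x512 is about {size:.1f} bytes")
```

At 0.005 bits per pixel, a 512x512 image therefore has a budget of roughly 164 bytes for the entire bitstream.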

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Image Compression | DIV2K 512 | BD-PSNR 44.03 | 90 |
| Image Compression | Kodak24 512 | PSNR 22.69 | 76 |
| Image Compression | CLIC2020 512x512 (test) | BD-PSNR 4.68 | 66 |
| Speech Reconstruction | Librispeech (test-clean) | UT MOS 4.23 | 59 |
| Image Reconstruction | Kodak (test) | -- | 33 |
| Image Compression | Kodak | BD-Rate (DISTS) -70.48 | 17 |
| Image Compression | CLIC 2020 | BD-Rate (LPIPS) -80.21 | 13 |
| Image Compression | Kodak24 512x512 (test) | BD-PSNR 1.58 | 13 |
| Image Compression | DIV2K | BD-Rate (LPIPS) -79.02 | 11 |
| Image Compression | Kodak24 (test) | PSNR 19.12 | 8 |
Showing 10 of 13 rows
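
Several results above are BD-Rate figures, which express the average bitrate change of one codec relative to an anchor at equal quality (a negative value means bitrate is saved). The page does not say which implementation produced these numbers, so the sketch below is only an illustration. It swaps the usual cubic fit for a simplified piecewise-linear interpolation of log-rate over the quality axis, and the function names are hypothetical.

```python
import math


def _avg_log_rate(quality, rate, lo, hi):
    """Mean of log(rate) over the quality interval [lo, hi].

    log(rate) is linearly interpolated between measured points.
    Assumes the quality values are distinct.
    """
    pts = sorted(zip(quality, (math.log(r) for r in rate)))

    def interp(q):
        for (q0, y0), (q1, y1) in zip(pts, pts[1:]):
            if q0 <= q <= q1:
                t = (q - q0) / (q1 - q0)
                return y0 + t * (y1 - y0)
        raise ValueError("quality value outside the measured curve")

    # Integrate segment by segment; the trapezoid rule is exact here
    # because the interpolant is linear between consecutive knots.
    knots = [lo] + [q for q, _ in pts if lo < q < hi] + [hi]
    area = 0.0
    for a, b in zip(knots, knots[1:]):
        area += 0.5 * (interp(a) + interp(b)) * (b - a)
    return area / (hi - lo)


def bd_rate(rate_anchor, qual_anchor, rate_test, qual_test):
    """Approximate BD-Rate in percent of the test codec against the anchor."""
    lo = max(min(qual_anchor), min(qual_test))
    hi = min(max(qual_anchor), max(qual_test))
    if hi <= lo:
        raise ValueError("rate-distortion curves do not overlap in quality")
    diff = (_avg_log_rate(qual_test, rate_test, lo, hi)
            - _avg_log_rate(qual_anchor, rate_anchor, lo, hi))
    return (math.exp(diff) - 1.0) * 100.0


if __name__ == "__main__":
    # Synthetic curves for illustration only (not results from the paper):
    # the test codec needs half the anchor's bitrate at every quality level.
    quality = [20.0, 22.0, 24.0, 26.0]
    anchor = [0.01, 0.02, 0.04, 0.08]
    test = [r / 2 for r in anchor]
    print(f"BD-Rate: {bd_rate(anchor, quality, test, quality):.2f}%")
```

The same integration applies to metrics where lower is better, such as LPIPS or DISTS, since only the shared range of the quality axis matters.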
