MLIC++: Linear Complexity Multi-Reference Entropy Modeling for Learned Image Compression
About
The latent representation in learned image compression exhibits channel-wise, local spatial, and global spatial correlations, which the entropy model must capture to minimize conditional entropy. Capturing these contexts efficiently within a single entropy model is challenging, especially in high-resolution image coding, because of the computational complexity of existing global context modules. To address this challenge, we propose the Linear Complexity Multi-Reference Entropy Model (MEM$^{++}$). Specifically, the latent representation is partitioned into multiple slices. For channel-wise contexts, previously compressed slices serve as the context for compressing a particular slice. For local contexts, we introduce a shifted-window-based checkerboard attention module. This module ensures linear complexity without sacrificing performance. For global contexts, we propose a linear complexity attention mechanism. It captures global correlations by decomposing the softmax operation, enabling the implicit computation of attention maps from previously decoded slices. Using MEM$^{++}$ as the entropy model, we develop the image compression method MLIC$^{++}$. Extensive experimental results demonstrate that MLIC$^{++}$ achieves state-of-the-art performance, reducing BD-rate by $13.39\%$ on the Kodak dataset compared to VTM-17.0 in terms of Peak Signal-to-Noise Ratio (PSNR). Furthermore, the computational complexity and memory consumption of MLIC$^{++}$ scale linearly with resolution, making it highly suitable for high-resolution image coding. Code and pre-trained models are available at https://github.com/JiangWeibeta/MLIC. The training dataset is available at https://huggingface.co/datasets/Whiteboat/MLIC-Train-100K.
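The idea of decomposing the softmax so that the attention map is never formed explicitly can be illustrated with a minimal sketch. It assumes one common form of the decomposition (a softmax over channels for each query and a softmax over positions for each key channel); the function names, toy sizes, and random inputs are hypothetical, and this is not the authors' implementation.

```python
import math
import random

def softmax(values):
    """Numerically stable softmax over a 1-D list."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def transpose(matrix):
    return [list(col) for col in zip(*matrix)]

def matmul(a, b):
    b_t = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in b_t] for row in a]

def normalize(q, k):
    """Apply the two separate softmaxes (the assumed decomposition)."""
    q_s = [softmax(row) for row in q]                        # per query, over channels
    k_s = transpose([softmax(col) for col in transpose(k)])  # per channel, over positions
    return q_s, k_s

def attention_explicit(q, k, v):
    """Reference path: builds the N x N attention map, quadratic in N."""
    q_s, k_s = normalize(q, k)
    attn = matmul(q_s, transpose(k_s))
    return matmul(attn, v)

def attention_linear(q, k, v):
    """Same result, but the d x d product K^T V is formed first, linear in N."""
    q_s, k_s = normalize(q, k)
    context = matmul(transpose(k_s), v)
    return matmul(q_s, context)

rng = random.Random(0)
n, d = 6, 4  # toy sizes: 6 latent positions, 4 channels (hypothetical)
q = [[rng.uniform(-1, 1) for _ in range(d)] for _ in range(n)]
k = [[rng.uniform(-1, 1) for _ in range(d)] for _ in range(n)]
v = [[rng.uniform(-1, 1) for _ in range(d)] for _ in range(n)]

out_explicit = attention_explicit(q, k, v)
out_linear = attention_linear(q, k, v)
max_gap = max(abs(a - b)
              for row_a, row_b in zip(out_explicit, out_linear)
              for a, b in zip(row_a, row_b))
print(f"largest difference between the two paths: {max_gap:.2e}")
```

Both paths give the same output because matrix multiplication is associative; the linear path only changes the order of the products, so its cost grows with the number of positions rather than with its square.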
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Image Compression | Kodak | BD-Rate (PSNR): -15.02 | 50 |
| Image Compression | Tecnick | BD-Rate (PSNR): -17.59 | 36 |
| Image Compression | Kodak (test) | -- | 32 |
| Image Compression | CLIC Professional (val) | BD-Rate (PSNR): -13.08 | 26 |
| Image Compression | CLIC | BD-Rate (PSNR): -14.45 | 16 |
| Image Compression | CLIC (test) | -- | 8 |
| Image Compression | Tecnick original (test) | BD-Rate (MS-SSIM): -53.14 | 7 |
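For readers unfamiliar with the metric, BD-rate is the average bitrate difference between two rate-distortion curves at equal quality, so a negative value means bitrate savings over the anchor. The sketch below is a simplified illustration: it uses piecewise-linear interpolation of log-rate against PSNR, whereas standard implementations typically fit a cubic polynomial or a piecewise cubic. The rate-distortion points are synthetic, and the code is not how the figures in the table were computed.

```python
import math

def interp(x, xs, ys):
    """Piecewise-linear interpolation; xs must be strictly increasing."""
    for (x0, y0), (x1, y1) in zip(zip(xs, ys), zip(xs[1:], ys[1:])):
        if x0 <= x <= x1:
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
    raise ValueError("x is outside the interpolation range")

def bd_rate(anchor, test, samples=1000):
    """Average bitrate change (in percent) of `test` relative to `anchor`.

    Both arguments are lists of (bitrate_bpp, psnr_db) points sorted by PSNR.
    """
    a_psnr = [p for _, p in anchor]
    a_logr = [math.log(r) for r, _ in anchor]
    t_psnr = [p for _, p in test]
    t_logr = [math.log(r) for r, _ in test]

    # Compare the curves only over the PSNR range they share.
    lo = max(a_psnr[0], t_psnr[0])
    hi = min(a_psnr[-1], t_psnr[-1])
    step = (hi - lo) / samples

    # Trapezoidal average of the log-rate gap over [lo, hi].
    total = 0.0
    for i in range(samples + 1):
        x = min(lo + i * step, hi)
        gap = interp(x, t_psnr, t_logr) - interp(x, a_psnr, a_logr)
        total += (0.5 if i in (0, samples) else 1.0) * gap
    avg_gap = total / samples
    return (math.exp(avg_gap) - 1.0) * 100.0

# Synthetic points: the test codec needs 10% fewer bits at every PSNR.
anchor_points = [(0.2, 30.0), (0.4, 33.0), (0.7, 36.0), (1.1, 39.0)]
test_points = [(rate * 0.9, psnr) for rate, psnr in anchor_points]
print(f"BD-rate: {bd_rate(anchor_points, test_points):.2f}")
```

With these synthetic points the log-rate gap is constant, so the result is a 10% bitrate reduction regardless of the interpolation scheme.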