
MLIC++: Linear Complexity Multi-Reference Entropy Modeling for Learned Image Compression

About

The latent representation in learned image compression encompasses channel-wise, local spatial, and global spatial correlations, which are essential for the entropy model to capture for conditional entropy minimization. Efficiently capturing these contexts within a single entropy model, especially in high-resolution image coding, presents a challenge due to the computational complexity of existing global context modules. To address this challenge, we propose the Linear Complexity Multi-Reference Entropy Model (MEM$^{++}$). Specifically, the latent representation is partitioned into multiple slices. For channel-wise contexts, previously compressed slices serve as the context for compressing a particular slice. For local contexts, we introduce a shifted-window-based checkerboard attention module. This module ensures linear complexity without sacrificing performance. For global contexts, we propose a linear complexity attention mechanism. It captures global correlations by decomposing the softmax operation, enabling the implicit computation of attention maps from previously decoded slices. Using MEM$^{++}$ as the entropy model, we develop the image compression method MLIC$^{++}$. Extensive experimental results demonstrate that MLIC$^{++}$ achieves state-of-the-art performance, reducing BD-rate by $13.39\%$ on the Kodak dataset compared to VTM-17.0 in Peak Signal-to-Noise Ratio (PSNR). Furthermore, MLIC$^{++}$ exhibits linear computational complexity and memory consumption with resolution, making it highly suitable for high-resolution image coding. Code and pre-trained models are available at https://github.com/JiangWeibeta/MLIC. Training dataset is available at https://huggingface.co/datasets/Whiteboat/MLIC-Train-100K.
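The global-context idea above, decomposing the softmax so the attention map is never materialised, can be illustrated with a minimal sketch. This is one common form of softmax-decomposed linear attention, not necessarily the exact module used in MLIC$^{++}$. The function names (`linear_attention`, `quadratic_reference`) and the plain-list tensor layout are assumptions made for illustration. The softmax is applied separately to queries (over features) and keys (over tokens), and the product is re-associated so the cost grows linearly with the number of tokens.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a flat list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def transpose(mat):
    """Transpose a non-empty list-of-lists matrix."""
    return [list(col) for col in zip(*mat)]


def matmul(a, b):
    """Plain matrix product of two list-of-lists matrices."""
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]


def _normalise(q, k):
    # Decomposed softmax: queries are normalised over the feature axis,
    # keys are normalised over the token axis.
    q_norm = [softmax(row) for row in q]
    k_norm = transpose([softmax(col) for col in transpose(k)])
    return q_norm, k_norm


def linear_attention(q, k, v):
    """q: n_q x d, k: n_k x d, v: n_k x d_v (hypothetical layout).

    Computes q_norm @ (k_norm^T @ v). The intermediate context is only
    d x d_v, so no n_q x n_k attention map is ever stored.
    """
    q_norm, k_norm = _normalise(q, k)
    context = matmul(transpose(k_norm), v)  # d x d_v
    return matmul(q_norm, context)          # n_q x d_v


def quadratic_reference(q, k, v):
    """Same result, but with the explicit n_q x n_k map (for comparison)."""
    q_norm, k_norm = _normalise(q, k)
    attn = matmul(q_norm, transpose(k_norm))  # n_q x n_k
    return matmul(attn, v)


if __name__ == "__main__":
    q = [[0.1, 0.2], [0.3, -0.4], [1.0, 0.5]]
    k = [[0.5, -0.1], [0.2, 0.7], [-0.3, 0.4], [0.9, 0.0]]
    v = [[1.0, 0.0, 2.0], [0.0, 1.0, 1.0], [3.0, 1.0, 0.0], [1.0, 1.0, 1.0]]
    print(linear_attention(q, k, v))
```

Because matrix multiplication is associative, both functions return the same values; only the size of the intermediate differs. Each row of the implicit attention map still sums to 1, so every output row is a convex combination of the value rows.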

Wei Jiang, Jiayu Yang, Yongqi Zhai, Feng Gao, Ronggang Wang • 2023

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Image Compression | Kodak | BD-Rate (PSNR): -15.09 | 58 |
| Image Compression | Tecnick | BD-Rate (PSNR): -18.68 | 44 |
| Image Compression | CLIC | BD-Rate (PSNR): -14.45 | 37 |
| Image Compression | CLIC Professional (val) | BD-Rate (PSNR): -16.84 | 34 |
| Image Compression | Kodak (test) | -- | 32 |
| Lossy Image Compression | Wind turbine image dataset full-resolution | BD-rate (PSNR): 7.54 | 14 |
| Image Compression | T2 dataset | File Size (bytes): 4.26e+3 | 8 |
| Image Compression | T1 | File Size (bytes): 4.17e+3 | 8 |
| Image Compression | CLIC (test) | -- | 8 |
| Image Compression | Tecnick original (test) | BD-Rate (MS-SSIM): -53.14 | 7 |
Showing 10 of 12 rows
