
Variational image compression with a scale hyperprior

About

We describe an end-to-end trainable model for image compression based on variational autoencoders. The model incorporates a hyperprior to effectively capture spatial dependencies in the latent representation. This hyperprior relates to side information, a concept universal to virtually all modern image codecs, but largely unexplored in image compression using artificial neural networks (ANNs). Unlike existing autoencoder compression methods, our model trains a complex prior jointly with the underlying autoencoder. We demonstrate that this model leads to state-of-the-art image compression when measuring visual quality using the popular MS-SSIM index, and yields rate-distortion performance surpassing published ANN-based methods when evaluated using a more traditional metric based on squared error (PSNR). Furthermore, we provide a qualitative comparison of models trained for different distortion metrics.

Johannes Ballé, David Minnen, Saurabh Singh, Sung Jin Hwang, Nick Johnston • 2018
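
The abstract's central idea is that side information tells the entropy coder how spread out each latent value is likely to be. As a rough illustration only, the sketch below estimates the code length of integer-quantized latents under a zero-mean Gaussian whose scale is supplied per element, with the probability mass taken over a unit-width bin. The function names, the example values, and the probability floor are all hypothetical; the sketch leaves out the networks that would predict the scales and the bits needed to transmit the side information itself.

```python
import math


def gaussian_cdf(x, scale):
    """CDF of a zero-mean Gaussian with standard deviation `scale`."""
    return 0.5 * (1.0 + math.erf(x / (scale * math.sqrt(2.0))))


def bits_for_latent(y_hat, scale):
    """Estimated code length in bits for one integer-quantized latent.

    The probability mass is the Gaussian density integrated over the
    unit-width bin centred on y_hat.
    """
    p = gaussian_cdf(y_hat + 0.5, scale) - gaussian_cdf(y_hat - 0.5, scale)
    p = max(p, 1e-9)  # illustrative floor to avoid log(0); not from the source
    return -math.log2(p)


def total_bits(latents, scales):
    """Sum of per-element code lengths for a list of latents."""
    return sum(bits_for_latent(y, s) for y, s in zip(latents, scales))


# Hypothetical quantized latents: two near-zero values and two large ones.
latents = [0, 0, 3, -4]

# Scales that roughly track each latent's magnitude, standing in for what
# side information could provide.
matched_scales = [0.3, 0.3, 3.0, 4.0]

# One shared scale for every element, standing in for a fixed prior.
fixed_scales = [1.0, 1.0, 1.0, 1.0]

bits_matched = total_bits(latents, matched_scales)
bits_fixed = total_bits(latents, fixed_scales)
print(f"matched scales: {bits_matched:.2f} bits")
print(f"fixed scale:    {bits_fixed:.2f} bits")
```

With these made-up numbers, the per-element scales give a shorter estimated code length than the single shared scale, which is the kind of saving that has to outweigh the cost of sending the side information.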

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Object Detection | COCO 2017 (val) | -- | 2454 |
| Instance Segmentation | COCO 2017 (val) | -- | 1144 |
| Image Compression | Kodak (test) | BD-Rate 40.85 | 32 |
| Watermark Verification | DiffusionDB (test) | TPR@1%FPR 45.4 | 15 |
| Quality Preservation | MS-COCO (test) | FID 44.91 | 13 |
| Quality Preservation | SD-Prompts (test) | FID 53.21 | 13 |
| Quality Preservation | DiffusionDB (test) | FID 52.71 | 13 |
| Watermark Removal | DiffusionDB-2M | LPIPS 0.614 | 9 |
| Image Compression | HRSOD (val) | BD-rate 150.6 | 8 |
| ROI Image Compression | COCO 2017 (val) | BD-rate 136 | 8 |
Showing 10 of 13 rows
