A Fast Transformer-based General-Purpose Lossless Compressor

About

Deep-learning-based compressors have received growing interest recently because of their much improved compression ratios. However, modern approaches suffer from long execution times. To ease this problem, this paper targets cutting down the execution time of deep-learning-based compressors. Building history dependencies sequentially (e.g., with recurrent neural networks) is responsible for long inference latency. Instead, we introduce the transformer into deep-learning compressors to build history dependencies in parallel. However, existing transformers are too computationally heavy and are incompatible with compression tasks. This paper proposes a fast general-purpose lossless compressor, TRACE, by designing a compression-friendly structure based on a single-layer transformer. We first design a new metric to guide the selection of components of the compression model structure. Byte-grouping and Shared-ffn schemes are further proposed to fully utilize the capacity of the single-layer transformer. These features allow TRACE to achieve a competitive compression ratio at a much faster speed. In addition, we further accelerate the compression procedure by designing a controller to reduce the parameter-updating overhead. Experiments show that TRACE achieves an overall $\sim$3x speedup while keeping a compression ratio comparable to state-of-the-art compressors. The source code for TRACE and links to the datasets are available at https://github.com/mynotwo/A-Fast-Transformer-based-General-Purpose-LosslessCompressor.

Yu Mao, Yufei Cui, Tei-Wei Kuo, Chun Jason Xue • 2022
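
The abstract rests on a general property of learned compressors: a model predicts a probability for each next byte from its history, and an entropy coder (such as an arithmetic coder) spends roughly -log2(p) bits on a byte predicted with probability p. The sketch below illustrates only that relationship. It is not TRACE: a simple adaptive order-0 byte-frequency model stands in for the single-layer transformer as an assumption, and the function names `ideal_compressed_bits` and `compression_ratio` are hypothetical. It also computes only the ideal code length rather than producing an actual bitstream.

```python
import math


def ideal_compressed_bits(data: bytes) -> float:
    """Ideal code length in bits under an adaptive order-0 byte model.

    The frequency table is a stand-in (assumption) for a learned predictor;
    any model that yields a probability per byte could replace it.
    """
    counts = [1] * 256  # start every byte value at count 1 so no probability is zero
    total = 256
    bits = 0.0
    for b in data:
        p = counts[b] / total   # probability assigned before the byte is seen
        bits += -math.log2(p)   # approximate cost of coding this byte
        counts[b] += 1          # update the model; a decoder can mirror this step
        total += 1
    return bits


def compression_ratio(data: bytes) -> float:
    """Original size divided by ideal compressed size (both in bits)."""
    return (8 * len(data)) / ideal_compressed_bits(data)


if __name__ == "__main__":
    sample = b"abracadabra " * 200
    print(f"ideal bits: {ideal_compressed_bits(sample):.1f}")
    print(f"ratio:      {compression_ratio(sample):.2f}")
```

Because the model is updated after every byte, the per-byte cost falls as the predictions improve. That per-step update is the kind of overhead a compressor that adapts its parameters during coding has to pay, which is why the quality and the speed of the predictor both matter.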

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Lossless Data Compression | LJSpeech | Compression Ratio | 1.783 | 11 |
| Lossless Data Compression | CESM float | Compression Ratio | 2.696 | 11 |
| Lossless Data Compression | TestImages image | Compression Ratio | 2.29 | 11 |
| Lossless Data Compression | Enwik9 text | Compression Ratio | 5.142 | 11 |
| Lossless Data Compression | UVG video | Compression Ratio | 2.336 | 11 |
| Lossless Data Compression | Silesia heterogeneous | Compression Ratio | 4.517 | 11 |
| Lossless Data Compression | DNACorpus genome | Compression Ratio | 4.278 | 11 |
| Lossless Data Compression | Silesia | Compression Throughput | 2.76e+3 | 7 |
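
For readers who want to reproduce figures of the kind listed above, the following is a minimal sketch of how a compression ratio and a throughput could be measured. It makes several assumptions not stated on this page: that the ratio is original size divided by compressed size, that throughput is input bytes per second of compression time (the page gives no unit), and that `zlib` from the Python standard library is an acceptable stand-in compressor. The helper name `measure` is hypothetical, and none of this reproduces the TRACE results.

```python
import time
import zlib


def measure(data: bytes, level: int = 6) -> tuple[float, float]:
    """Return (compression ratio, throughput in bytes/second) for zlib.

    zlib is used only as a readily available baseline (assumption).
    """
    start = time.perf_counter()
    compressed = zlib.compress(data, level)
    elapsed = time.perf_counter() - start

    # Lossless check: decompression must give back the exact input.
    assert zlib.decompress(compressed) == data

    ratio = len(data) / len(compressed)
    # Guard against a zero reading from the timer on very small inputs.
    throughput = len(data) / elapsed if elapsed > 0 else float("inf")
    return ratio, throughput


if __name__ == "__main__":
    payload = b"lossless compression benchmark sample\n" * 50_000
    ratio, throughput = measure(payload)
    print(f"ratio: {ratio:.2f}, throughput: {throughput:.3e} bytes/s")
```

Timing a single call is noisy. Averaging over several runs and larger inputs gives a steadier throughput figure.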
