
EDPC: Accelerating Lossless Compression via Lightweight Probability Models and Decoupled Parallel Dataflow

About

The explosive growth of multi-source multimedia data has significantly increased the demands for transmission and storage, placing substantial pressure on bandwidth and storage infrastructures. While Autoregressive Compression Models (ACMs) have markedly improved compression efficiency through probabilistic prediction, current approaches remain constrained by two critical limitations: suboptimal compression ratios due to insufficient fine-grained feature extraction during probability modeling, and real-time processing bottlenecks caused by high resource consumption and low compression speeds. To address these challenges, we propose Efficient Dual-path Parallel Compression (EDPC), a hierarchically optimized compression framework that enhances both modeling capability and execution efficiency via coordinated dual-path operations. At the modeling level, we introduce the Information Flow Refinement (IFR) metric, grounded in mutual information theory, and design a Multi-path Byte Refinement Block (MBRB) to strengthen cross-byte dependency modeling via heterogeneous feature propagation. At the system level, we develop a Latent Transformation Engine (LTE) for compact high-dimensional feature representation and a Decoupled Pipeline Compression Architecture (DPCA) to eliminate encoding-decoding latency through pipelined parallelization. Experimental results demonstrate that EDPC improves on state-of-the-art methods, achieving 2.7x faster compression and a 3.2% higher compression ratio. These advancements establish EDPC as an efficient solution for real-time processing of large-scale multimedia data in bandwidth-constrained scenarios. Our code is available at https://github.com/Magie0/EDPC.
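The abstract describes the IFR metric as grounded in mutual information and the MBRB as targeting cross-byte dependencies. As a rough illustration of the underlying quantity only, the sketch below estimates the empirical mutual information between each byte and its successor. It is not the IFR metric itself, whose definition is not given here; the function name `adjacent_byte_mutual_information` and the choice of adjacent-byte pairs are assumptions made for this example.

```python
import math
from collections import Counter


def adjacent_byte_mutual_information(data: bytes) -> float:
    """Empirical mutual information, in bits, between a byte and the next byte.

    Hypothetical helper for illustration; not part of the EDPC code base.
    """
    if len(data) < 2:
        return 0.0

    total = len(data) - 1                      # number of adjacent pairs
    pairs = Counter(zip(data, data[1:]))       # joint counts n(x, y)
    left = Counter(data[:-1])                  # marginal counts n(x)
    right = Counter(data[1:])                  # marginal counts n(y)

    mi = 0.0
    for (x, y), n_xy in pairs.items():
        # p(x, y) * log2( p(x, y) / (p(x) * p(y)) ), written with raw counts
        mi += (n_xy / total) * math.log2(n_xy * total / (left[x] * right[y]))
    return mi


if __name__ == "__main__":
    print(adjacent_byte_mutual_information(b"a" * 100))   # no dependency to exploit
    print(adjacent_byte_mutual_information(b"ab" * 500))  # next byte fully determined
```

A higher value means the previous byte carries more information about the next one, which is the kind of dependency a probability model can exploit to shorten the encoded output.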

Zeyi Lu, Xiaoxiao Ma, Yujun Huang, Minxiao Chen, Bin Chen, Baoyi An, Shu-Tao Xia • 2025

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| Lossless Data Compression | Enwik9 (text) | Compression Ratio | 6.176 | 11 |
| Lossless Data Compression | LJSpeech | Compression Ratio | 1.879 | 11 |
| Lossless Data Compression | TestImages (image) | Compression Ratio | 2.392 | 11 |
| Lossless Data Compression | UVG (video) | Compression Ratio | 2.52 | 11 |
| Lossless Data Compression | CESM (float) | Compression Ratio | 2.91 | 11 |
| Lossless Data Compression | DNACorpus (genome) | Compression Ratio | 4.472 | 11 |
| Lossless Data Compression | Silesia (heterogeneous) | Compression Ratio | 5.321 | 11 |
| Lossless Data Compression | Silesia | Compression Throughput | 4.39e+3 | 7 |
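To make the compression ratios above easier to interpret, the sketch below converts a ratio into bits per byte and an approximate compressed size. It assumes the ratio is defined as original size divided by compressed size, which the listing does not state explicitly, and it uses the fact that Enwik9 is the first 10^9 bytes of an English Wikipedia dump. The function names are hypothetical.

```python
def bits_per_byte(compression_ratio: float) -> float:
    """Average encoded bits per input byte, assuming ratio = original / compressed."""
    if compression_ratio <= 0:
        raise ValueError("compression ratio must be positive")
    return 8.0 / compression_ratio


def compressed_size(original_bytes: int, compression_ratio: float) -> float:
    """Approximate compressed size in bytes under the same assumption."""
    if compression_ratio <= 0:
        raise ValueError("compression ratio must be positive")
    return original_bytes / compression_ratio


if __name__ == "__main__":
    enwik9_bytes = 10**9          # Enwik9 is 10^9 bytes of Wikipedia text
    ratio = 6.176                 # Enwik9 result reported in the table
    print(f"{bits_per_byte(ratio):.3f} bits per byte")
    print(f"{compressed_size(enwik9_bytes, ratio) / 1e6:.1f} MB compressed")
```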
