Efficient Learned Data Compression via Dual-Stream Feature Decoupling
About
While Learned Data Compression (LDC) has achieved superior compression ratios, balancing precise probability modeling with system efficiency remains challenging. Crucially, uniform single-stream architectures struggle to simultaneously capture micro-syntactic and macro-semantic features, necessitating deep serial stacking that exacerbates latency. Compounding this, heterogeneous systems are constrained by device speed mismatches, where throughput is capped by Amdahl's Law due to serial processing. To this end, we propose a Dual-Stream Multi-Scale Decoupler that disentangles local and global contexts to replace deep serial processing with shallow parallel streams, and incorporate a Hierarchical Gated Refiner for adaptive feature refinement and precise probability modeling. Furthermore, we design a Concurrent Stream-Parallel Pipeline, which overcomes systemic bottlenecks to achieve full-pipeline parallelism. Extensive experiments demonstrate that our method achieves state-of-the-art performance in both compression ratio and throughput, while maintaining the lowest latency and memory usage. The code is available at https://github.com/huidong-ma/FADE.
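The Amdahl's Law bottleneck mentioned above can be made concrete with a short sketch. The function below is illustrative only (the fractions and worker counts are hypothetical, not measurements from the paper): it shows why, if any stage of a compression pipeline stays serial, adding parallel workers yields sharply diminishing returns.

```python
def amdahl_speedup(parallel_fraction: float, n_workers: int) -> float:
    """Maximum overall speedup when only `parallel_fraction` of the
    work can be parallelized across `n_workers` (Amdahl's Law)."""
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / n_workers)

# With a 10% serial portion, speedup is capped at 10x no matter how
# many workers are added -- hence the need for full-pipeline parallelism.
print(round(amdahl_speedup(0.9, 10**9), 2))   # approaches 10.0
print(round(amdahl_speedup(0.9, 8), 2))       # only 4.71x with 8 workers
```

This is why the abstract targets full-pipeline parallelism: shrinking the serial fraction matters more than adding compute.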
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Lossless Data Compression | Enwik9 (text) | Compression Ratio | 6.288 | 11 |
| Lossless Data Compression | LJSpeech | Compression Ratio | 1.88 | 11 |
| Lossless Data Compression | TestImages (image) | Compression Ratio | 2.402 | 11 |
| Lossless Data Compression | UVG (video) | Compression Ratio | 2.603 | 11 |
| Lossless Data Compression | CESM (float) | Compression Ratio | 2.939 | 11 |
| Lossless Data Compression | DNACorpus (genome) | Compression Ratio | 4.503 | 11 |
| Lossless Data Compression | Silesia (heterogeneous) | Compression Ratio | 5.4 | 11 |
| Lossless Data Compression | Silesia | Compression Throughput | 4.57e+3 | 7 |
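To interpret the table, a small sketch may help. It assumes "Compression Ratio" means original size divided by compressed size (a common convention, though not stated on this page), and uses the well-known 10^9-byte size of Enwik9:

```python
def compressed_size(original_bytes: int, ratio: float) -> float:
    # Assumes ratio = original size / compressed size (hedged assumption).
    return original_bytes / ratio

# Enwik9 is 10^9 bytes; at the reported ratio of 6.288, the compressed
# output would be roughly 159 MB.
print(round(compressed_size(10**9, 6.288) / 1e6, 1))  # -> 159.0
```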