Error-Propagation-Free Learned Video Compression With Dual-Domain Progressive Temporal Alignment
About
Existing frameworks for learned video compression face a dilemma between inaccurate temporal alignment and error propagation in motion estimation and compensation (ME/MC). The separate-transform framework employs distinct transforms for intra-frame and inter-frame compression, which yields impressive rate-distortion (R-D) performance but causes evident error propagation. The unified-transform framework eliminates error propagation via shared transforms, but its ME/MC is less accurate because it operates in a shared latent domain. To address this limitation, we propose a novel unified-transform framework with dual-domain progressive temporal alignment and a quality-conditioned mixture-of-experts (QCMoE) module to enable quality-consistent and error-propagation-free streaming for learned video compression. Specifically, we propose dual-domain progressive temporal alignment for ME/MC, which combines coarse pixel-domain alignment with refined latent-domain alignment to significantly enhance temporal context modeling in a coarse-to-fine fashion. The coarse pixel-domain alignment efficiently handles simple motion patterns using optical flow estimated from a single reference frame, while the refined latent-domain alignment applies a Flow-Guided Deformable Transformer (FGDT) to latents from multiple reference frames to achieve long-term motion refinement (LTMR) for complex motion patterns. Furthermore, we design a QCMoE module for continuous bit-rate adaptation that dynamically assigns different experts to adjust the quantization step per pixel based on target quality and content, rather than relying on a single quantization step. QCMoE allows continuous and consistent rate control with appealing R-D performance. Experimental results show that the proposed method achieves competitive R-D performance compared with state-of-the-art methods, while successfully eliminating error propagation.
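The idea of adjusting quantization steps per pixel from target quality and content can be illustrated with a toy sketch. This is not the authors' QCMoE module: the expert bank `EXPERT_STEPS`, the gating rule in `gate_weights`, the `sharpness` parameter, and the scalar content feature are all hypothetical choices made only to show how a soft mixture of experts might turn a quality level and a content map into a per-pixel step map.

```python
import math

# Hypothetical expert bank: each expert proposes one quantization step size.
EXPERT_STEPS = [0.25, 0.5, 1.0, 2.0]


def gate_weights(quality, content, sharpness=2.0):
    """Toy gate: softmax weights over experts.

    quality and content are assumed to lie in [0, 1]. Higher quality or
    higher content activity shifts the weight toward finer steps.
    """
    n = len(EXPERT_STEPS)
    centre = (1.0 - quality) * (n - 1) * (1.0 - 0.5 * content)
    scores = [-sharpness * (i - centre) ** 2 for i in range(n)]
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def step_map(quality, content_map):
    """Per-pixel step: gate-weighted average of the expert step sizes."""
    return [
        [
            sum(w * s for w, s in zip(gate_weights(quality, c), EXPERT_STEPS))
            for c in row
        ]
        for row in content_map
    ]


def quantize(latent, steps):
    """Uniform quantization with a separate step size at every position."""
    return [
        [round(v / s) * s for v, s in zip(lrow, srow)]
        for lrow, srow in zip(latent, steps)
    ]


latent = [[0.9, -1.3], [2.2, 0.4]]
content = [[0.1, 0.8], [0.5, 0.0]]
fine = step_map(0.9, content)    # high target quality
coarse = step_map(0.1, content)  # low target quality
print(quantize(latent, fine))
print(quantize(latent, coarse))
```

Because `quality` is a continuous input to the gate, the step map varies smoothly between the coarsest and finest expert, which is the property that makes continuous rate adaptation possible in this kind of design. In a learned codec the gate and experts would be trained networks operating on feature maps rather than the fixed rule used here.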
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Video Compression | HEVC Class D | BD-Rate | -16.6 | 74 |
| Video Compression | MCL-JCV | BD-Rate (PSNR) | -10.4 | 60 |
| Video Compression | HEVC Class B | BD-Rate (%) | -7.7 | 58 |
| Video Compression | HEVC Class E | BD-Rate (%) | -16.4 | 53 |
| Video Compression | UVG | BD-Rate (PSNR) | -24.4 | 49 |
| Video Compression | HEVC ClassB | BD-Rate (MS-SSIM) | -64.5 | 17 |
| Video Compression | 1080p videos | Encoding Latency (s) | 0.729 | 14 |
| Video Compression | HEVC Class C | BD-rate (PSNR) | 4.9 | 14 |
| Video Compression | HEVC E | BD-Rate (MS-SSIM) | -63.2 | 7 |
| Video Compression | HEVC Class C | BD-Rate (MS-SSIM) | -50.1 | 7 |