
Error-Propagation-Free Learned Video Compression With Dual-Domain Progressive Temporal Alignment

About

Existing frameworks for learned video compression face a dilemma between inaccurate temporal alignment and error propagation in motion estimation and compensation (ME/MC). The separate-transform framework employs distinct transforms for intra-frame and inter-frame compression, which yields impressive rate-distortion (R-D) performance but causes evident error propagation. The unified-transform framework eliminates error propagation via shared transforms but performs worse at ME/MC in the shared latent domain. To address this limitation, we propose a novel unified-transform framework with dual-domain progressive temporal alignment and a quality-conditioned mixture-of-experts (QCMoE) module to enable quality-consistent and error-propagation-free streaming for learned video compression. Specifically, we propose dual-domain progressive temporal alignment for ME/MC, which combines coarse pixel-domain alignment with refined latent-domain alignment to significantly enhance temporal context modeling in a coarse-to-fine fashion. The coarse pixel-domain alignment efficiently handles simple motion patterns with optical flow estimated from a single reference frame, while the refined latent-domain alignment applies a Flow-Guided Deformable Transformer (FGDT) over latents from multiple reference frames to achieve long-term motion refinement (LTMR) for complex motion patterns. Furthermore, we design a QCMoE module for continuous bit-rate adaptation that dynamically assigns different experts to adjust the quantization step per pixel based on target quality and content, rather than relying on a single quantization step. QCMoE allows continuous and consistent rate control with appealing R-D performance. Experimental results show that the proposed method achieves competitive R-D performance compared with state-of-the-art methods, while successfully eliminating error propagation.
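To make the idea of a per-pixel, quality-conditioned quantization step more concrete, here is a minimal toy sketch. It is not the paper's QCMoE module: the expert step sizes (`EXPERT_STEPS`), the hand-written gate in `per_pixel_step`, the `activity` content measure, and the `temperature` parameter are all hypothetical stand-ins for what would be learned networks. It only illustrates how a soft mixture over several experts can give a step size that varies smoothly with target quality and content, instead of one global step.

```python
import math

# Hypothetical experts: each one simply proposes a fixed quantization step.
EXPERT_STEPS = [0.25, 0.5, 1.0, 2.0]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def per_pixel_step(quality, activity, temperature=0.5):
    """Blend expert steps for one pixel.

    quality:  target quality in [0, 1], where 1 means finest quantization.
    activity: non-negative content measure (assumed, e.g. local gradient).
    """
    lo = math.log2(EXPERT_STEPS[0])
    hi = math.log2(EXPERT_STEPS[-1])
    # Assumed gate: higher quality pulls the target step down,
    # busier content pushes it slightly up.
    target = hi - quality * (hi - lo) + 0.5 * min(activity, 1.0)
    scores = [-(math.log2(s) - target) ** 2 / temperature for s in EXPERT_STEPS]
    weights = softmax(scores)
    return sum(w * s for w, s in zip(weights, EXPERT_STEPS))


def quantize(latent, quality, activity):
    """Quantize a 2-D latent (list of rows) with a per-pixel step."""
    quantized, steps = [], []
    for y_row, a_row in zip(latent, activity):
        s_row = [per_pixel_step(quality, a) for a in a_row]
        quantized.append([round(y / s) * s for y, s in zip(y_row, s_row)])
        steps.append(s_row)
    return quantized, steps
```

Because the gate output is a smooth function of `quality`, sweeping `quality` from 0 to 1 changes the step continuously, which is the property the abstract refers to as continuous bit-rate adaptation.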

Han Li, Shaohui Li, Wenrui Dai, Chenglin Li, Xinlong Pan, Haipeng Wang, Junni Zou, Hongkai Xiong • 2025

Related benchmarks

Task               Dataset        Metric                 Result   Rank
Video Compression  HEVC Class D   BD-Rate                -16.6    74
Video Compression  MCL-JCV        BD-Rate (PSNR)         -10.4    60
Video Compression  HEVC Class B   BD-Rate (%)            -7.7     58
Video Compression  HEVC Class E   BD-Rate (%)            -16.4    53
Video Compression  UVG            BD-Rate (PSNR)         -24.4    49
Video Compression  HEVC Class B   BD-Rate (MS-SSIM)      -64.5    17
Video Compression  1080p videos   Encoding Latency (s)   0.729    14
Video Compression  HEVC Class C   BD-Rate (PSNR)         4.9      14
Video Compression  HEVC E         BD-Rate (MS-SSIM)      -63.2    7
Video Compression  HEVC Class C   BD-Rate (MS-SSIM)      -50.1    7
Showing 10 of 11 rows
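
For readers unfamiliar with the BD-Rate metric in the table: it is the average bitrate difference, in percent, between a test codec and an anchor codec at equal quality, so negative values mean bitrate savings. The sketch below is a simplified illustration only. It uses piecewise-linear interpolation of log-rate over quality, whereas the standard Bjøntegaard calculation typically fits a cubic polynomial or uses piecewise-cubic interpolation. The function names and the numbers in the usage example are made up for illustration and are not taken from the paper or from this page.

```python
import math


def _interp(x, xs, ys):
    """Piecewise-linear interpolation; xs must be strictly increasing."""
    x = min(max(x, xs[0]), xs[-1])  # clamp to guard against rounding error
    for i in range(len(xs) - 1):
        if xs[i] <= x <= xs[i + 1]:
            t = (x - xs[i]) / (xs[i + 1] - xs[i])
            return ys[i] + t * (ys[i + 1] - ys[i])
    raise ValueError("interpolation failed; check that xs is increasing")


def _mean_log_rate(quality, rate, lo, hi, samples=1000):
    """Average of log(rate) over the quality interval [lo, hi]."""
    pts = sorted(zip(quality, rate))
    xs = [q for q, _ in pts]
    ys = [math.log(r) for _, r in pts]
    step = (hi - lo) / samples
    vals = [_interp(lo + i * step, xs, ys) for i in range(samples + 1)]
    # Trapezoidal integration, then divide by the interval length.
    area = sum((vals[i] + vals[i + 1]) * 0.5 * step for i in range(samples))
    return area / (hi - lo)


def bd_rate(anchor_quality, anchor_rate, test_quality, test_rate):
    """Simplified BD-Rate in percent (negative = test saves bitrate)."""
    lo = max(min(anchor_quality), min(test_quality))
    hi = min(max(anchor_quality), max(test_quality))
    if hi <= lo:
        raise ValueError("quality ranges of the two curves do not overlap")
    diff = (_mean_log_rate(test_quality, test_rate, lo, hi)
            - _mean_log_rate(anchor_quality, anchor_rate, lo, hi))
    return (math.exp(diff) - 1.0) * 100.0


# Synthetic example: the test codec needs 20% fewer bits at every PSNR.
psnr = [32.0, 34.0, 36.0, 38.0]
anchor_kbps = [100.0, 200.0, 400.0, 800.0]
test_kbps = [80.0, 160.0, 320.0, 640.0]
example_bd_rate = bd_rate(psnr, anchor_kbps, psnr, test_kbps)
```

The same function applies to MS-SSIM-based rows if the quality axis is replaced by MS-SSIM values, often converted to a dB scale first.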
