Error-Propagation-Free Learned Video Compression With Dual-Domain Progressive Temporal Alignment

About

Existing frameworks for learned video compression suffer from a dilemma between inaccurate temporal alignment and error propagation for motion estimation and compensation (ME/MC). The separate-transform framework employs distinct transforms for intra-frame and inter-frame compression to yield impressive rate-distortion (R-D) performance but causes evident error propagation, while the unified-transform framework eliminates error propagation via shared transforms but is inferior in ME/MC in shared latent domains. To address this limitation, in this paper, we propose a novel unified-transform framework with dual-domain progressive temporal alignment and quality-conditioned mixture-of-experts (QCMoE) to enable quality-consistent and error-propagation-free streaming for learned video compression. Specifically, we propose dual-domain progressive temporal alignment for ME/MC that leverages coarse pixel-domain alignment and refined latent-domain alignment to significantly enhance temporal context modeling in a coarse-to-fine fashion. The coarse pixel-domain alignment efficiently handles simple motion patterns with optical flow estimated from a single reference frame, while the refined latent-domain alignment develops a Flow-Guided Deformable Transformer (FGDT) over latents from multiple reference frames to achieve long-term motion refinement (LTMR) for complex motion patterns. Furthermore, we design a QCMoE module for continuous bit-rate adaptation that dynamically assigns different experts to adjust quantization steps per pixel based on target quality and content, rather than relying on a single quantization step. QCMoE allows continuous and consistent rate control with appealing R-D performance. Experimental results show that the proposed method achieves competitive R-D performance compared with state-of-the-art methods, while successfully eliminating error propagation.
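As a rough illustration of the QCMoE idea described in the abstract (per-pixel quantization steps chosen from several experts based on target quality and content), the following is a minimal sketch and not the authors' implementation. It assumes a soft gate over a small set of fixed candidate steps; the function names (`qcmoe_steps`, `quantize`), the scalar content feature, and the gating weights are all hypothetical, and the paper's actual experts and gating network may differ substantially.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def qcmoe_steps(content, quality, expert_steps, gate_weights):
    """Per-pixel quantization steps as a gate-weighted mix of expert steps.

    content:      2-D list of per-pixel content features (hypothetical scalars)
    quality:      target quality scalar, assumed here to lie in [0, 1]
    expert_steps: one candidate quantization step per expert (assumption)
    gate_weights: per-expert (w_content, w_quality, bias) tuples (assumption)
    """
    steps = []
    for row in content:
        out_row = []
        for c in row:
            # The gate sees both the pixel's content feature and the quality.
            logits = [wc * c + wq * quality + b for wc, wq, b in gate_weights]
            gates = softmax(logits)
            out_row.append(sum(g * s for g, s in zip(gates, expert_steps)))
        steps.append(out_row)
    return steps


def quantize(latent, steps):
    """Uniform quantization of each latent value with its own step size."""
    return [
        [round(y / s) * s for y, s in zip(lat_row, step_row)]
        for lat_row, step_row in zip(latent, steps)
    ]


# Toy usage: two experts, a fine step (0.5) and a coarse step (2.0).
expert_steps = [0.5, 2.0]
gate_weights = [(0.0, 4.0, 0.0), (0.0, -4.0, 0.0)]  # higher quality favors the fine step
content = [[0.1, 0.9], [0.4, 0.6]]
latent = [[1.3, -0.7], [2.2, 0.05]]

high_q = qcmoe_steps(content, 1.0, expert_steps, gate_weights)
low_q = qcmoe_steps(content, 0.0, expert_steps, gate_weights)
recon = quantize(latent, high_q)
```

Because the gate output varies smoothly with `quality`, the resulting step sizes also vary smoothly, which is one plausible way to picture the continuous rate adaptation the abstract mentions.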

Han Li, Shaohui Li, Wenrui Dai, Chenglin Li, Xinlong Pan, Haipeng Wang, Junni Zou, Hongkai Xiong • 2025

Related benchmarks

Task	Dataset	Metric	Result
Video Compression	MCL-JCV	BD-Rate (PSNR)	-10.4	92
Video Compression	HEVC Class D	BD-Rate	-16.6	74
Video Compression	HEVC Class B	BD-Rate (%)	-7.7	63
Video Compression	HEVC Class E	BD-Rate (%)	-16.4	60
Video Compression	UVG	BD-Rate (PSNR)	-24.4	55
Video Compression	HEVC Class B	BD-Rate (MS-SSIM)	-64.5	17
Video Compression	1080p videos	Encoding Latency (s)	0.729	14
Video Compression	HEVC Class C	BD-Rate (PSNR)	4.9	14
Video Compression	HEVC E	BD-Rate (MS-SSIM)	-63.2	7
Video Compression	HEVC Class C	BD-Rate (MS-SSIM)	-50.1	7

Showing 10 of 11 rows
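For readers unfamiliar with the BD-Rate metric reported in the table above, the sketch below shows the classic Bjøntegaard formulation: fit log-bitrate as a cubic polynomial of quality for each codec, integrate both curves over the shared quality range, and convert the average log-rate gap into a percentage. This is an assumption about the metric in general, not a statement of the exact evaluation protocol behind the listed numbers (anchor codec, quality metric, and interpolation variant are not given here). The helper names and the sample rate-quality points are hypothetical, and exactly four points per curve are assumed so that the cubic passes through every point.

```python
import math


def lagrange_eval(xs, ys, x):
    """Evaluate the unique polynomial through the points (xs, ys) at x."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total


def integrate_cubic(xs, ys, lo, hi):
    """Integrate the interpolating cubic on [lo, hi]; Simpson's rule is exact for cubics."""
    mid = 0.5 * (lo + hi)
    f_lo = lagrange_eval(xs, ys, lo)
    f_mid = lagrange_eval(xs, ys, mid)
    f_hi = lagrange_eval(xs, ys, hi)
    return (hi - lo) / 6.0 * (f_lo + 4.0 * f_mid + f_hi)


def bd_rate(anchor, test):
    """BD-Rate in percent; negative means `test` needs less bitrate than `anchor`.

    anchor, test: lists of exactly four (bitrate, quality) pairs, e.g. (kbps, PSNR in dB).
    """
    if len(anchor) != 4 or len(test) != 4:
        raise ValueError("this sketch assumes four rate-quality points per curve")
    a_q = [q for _, q in anchor]
    t_q = [q for _, q in test]
    a_log_r = [math.log(r) for r, _ in anchor]
    t_log_r = [math.log(r) for r, _ in test]
    # Only the quality range covered by both curves is compared.
    lo = max(min(a_q), min(t_q))
    hi = min(max(a_q), max(t_q))
    if hi <= lo:
        raise ValueError("the two curves have no overlapping quality range")
    avg_diff = (
        integrate_cubic(t_q, t_log_r, lo, hi) - integrate_cubic(a_q, a_log_r, lo, hi)
    ) / (hi - lo)
    return (math.exp(avg_diff) - 1.0) * 100.0


# Hypothetical data: the test codec uses 10% less bitrate at every quality level.
anchor_points = [(1000.0, 32.0), (2000.0, 35.0), (4000.0, 37.5), (8000.0, 39.5)]
test_points = [(0.9 * r, q) for r, q in anchor_points]
print(bd_rate(anchor_points, test_points))  # approximately -10.0
```

Under this convention, a negative entry in the table indicates a bitrate saving at equal quality relative to the anchor, and a positive entry indicates a bitrate increase.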
