Fine-Grained Motion Compression and Selective Temporal Fusion for Neural B-Frame Video Coding
About
With the remarkable progress in neural P-frame video coding, neural B-frame coding has recently emerged as a critical research direction. However, most existing neural B-frame codecs directly adopt P-frame coding tools without adequately addressing the unique challenges of B-frame compression, leading to suboptimal performance. To bridge this gap, we propose novel enhancements for motion compression and temporal fusion for neural B-frame coding. First, we design a fine-grained motion compression method. This method incorporates an interactive dual-branch motion auto-encoder with per-branch adaptive quantization steps, which enables fine-grained compression of bi-directional motion vectors while accommodating their asymmetric bitrate allocation and reconstruction quality requirements. Furthermore, this method involves an interactive motion entropy model that exploits correlations between bi-directional motion latent representations by interactively leveraging partitioned latent segments as directional priors. Second, we propose a selective temporal fusion method that predicts bi-directional fusion weights to achieve discriminative utilization of bi-directional multi-scale temporal contexts with varying qualities. Additionally, this method introduces a hyperprior-based implicit alignment mechanism for contextual entropy modeling. By treating the hyperprior as a surrogate for the contextual latent representation, this mechanism implicitly mitigates the misalignment in the fused bi-directional temporal priors. Extensive experiments demonstrate that our proposed codec achieves an average BD-rate reduction of approximately 10% compared to the state-of-the-art neural B-frame codec, DCVC-B, and delivers comparable or even superior compression performance to the H.266/VVC reference software under random-access configurations.
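The idea of predicting bi-directional fusion weights can be illustrated with a small NumPy sketch. It is not the authors' network. It assumes that some predictor (not shown) has already produced one score map per direction, and it turns those scores into per-pixel weights that sum to one before blending the forward and backward contexts. The function name `fuse_contexts`, the two-way softmax, and the tensor shapes are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def fuse_contexts(ctx_fwd, ctx_bwd, score_fwd, score_bwd):
    """Blend two temporal contexts with per-pixel weights that sum to 1.

    ctx_fwd, ctx_bwd: arrays of shape (C, H, W), hypothetical contexts
        derived from the forward and backward reference frames.
    score_fwd, score_bwd: arrays of shape (1, H, W), hypothetical
        unnormalized scores; a higher score means the context is trusted more.
    """
    # Numerically stable two-way softmax over the two score maps.
    m = np.maximum(score_fwd, score_bwd)
    e_fwd = np.exp(score_fwd - m)
    e_bwd = np.exp(score_bwd - m)
    w_fwd = e_fwd / (e_fwd + e_bwd)
    w_bwd = 1.0 - w_fwd
    # Broadcasting applies the (1, H, W) weights to every channel.
    fused = w_fwd * ctx_fwd + w_bwd * ctx_bwd
    return fused, w_fwd, w_bwd

# Toy usage: equal scores give a plain average of the two contexts.
ctx_f = np.ones((2, 3, 3))
ctx_b = np.zeros((2, 3, 3))
scores = np.zeros((1, 3, 3))
fused_equal, _, _ = fuse_contexts(ctx_f, ctx_b, scores, scores)
```

With equal scores the result is the plain average of the two contexts. When one direction scores much higher, for example because its reference frame is of better quality, the fused context leans almost entirely on that direction. A multi-scale variant would repeat the same blend at each resolution.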
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Video Compression | HEVC Class D | BD-Rate | -45.4 | 74 |
| Video Compression | MCL-JCV | BD-Rate (PSNR) | -30.5 | 60 |
| Video Compression | HEVC Class E | BD-Rate (%) | -50.8 | 53 |
| Video Compression | UVG | BD-Rate (PSNR) | -31.4 | 49 |
| Video Compression | UVG (test) | BD-Bitrate (PSNR) | -27.4 | 30 |
| Video Compression | MCL-JCV (test) | BD-Bitrate (PSNR) | -27.5 | 26 |
| Video Compression | HEVC Class B (test) | BD-Bitrate (PSNR) | -34.7 | 25 |
| Video Compression | HEVC ClassB | -- | -- | 17 |
| Video Compression | HEVC Class D (test) | BD-Rate (PSNR) | -43.5 | 16 |
| Video Compression | HEVC Class C (test) | BD-Rate (PSNR) | -28.4 | 16 |
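For readers unfamiliar with the BD-rate figures above, the sketch below shows the commonly used Bjøntegaard procedure: fit a cubic polynomial to log-bitrate as a function of PSNR for each codec, integrate both fits over the overlapping PSNR range, and convert the mean log-rate gap to a percentage. The page does not say which BD-rate variant produced the listed numbers, so treat this as one plausible reading. The function name `bd_rate` and the rate-distortion points are hypothetical and are not results from the paper.

```python
import numpy as np

def bd_rate(rate_anchor, psnr_anchor, rate_test, psnr_test):
    """Average bitrate change (%) of the test codec at equal PSNR.

    Negative values mean the test codec needs fewer bits than the anchor.
    Each input is a sequence of at least four rate-distortion points.
    """
    log_ra = np.log(np.asarray(rate_anchor, dtype=float))
    log_rt = np.log(np.asarray(rate_test, dtype=float))
    pa = np.asarray(psnr_anchor, dtype=float)
    pt = np.asarray(psnr_test, dtype=float)

    # Cubic fit of log-rate as a function of PSNR for each codec.
    fit_a = np.polyfit(pa, log_ra, 3)
    fit_t = np.polyfit(pt, log_rt, 3)

    # Integrate only over the PSNR interval covered by both curves.
    lo = max(pa.min(), pt.min())
    hi = min(pa.max(), pt.max())
    int_a = np.polyint(fit_a)
    int_t = np.polyint(fit_t)
    avg_a = (np.polyval(int_a, hi) - np.polyval(int_a, lo)) / (hi - lo)
    avg_t = (np.polyval(int_t, hi) - np.polyval(int_t, lo)) / (hi - lo)

    # Mean log-rate difference, converted to a percentage.
    return (np.exp(avg_t - avg_a) - 1.0) * 100.0

# Hypothetical rate-distortion points (kbps, dB), for illustration only.
anchor_rate = [1000.0, 2000.0, 4000.0, 8000.0]
anchor_psnr = [32.0, 35.0, 38.0, 40.0]
# A test codec that reaches the same PSNR with 30% fewer bits everywhere.
test_rate = [0.7 * r for r in anchor_rate]

saving = bd_rate(anchor_rate, anchor_psnr, test_rate, anchor_psnr)
print(round(saving, 2))  # roughly -30.0
```

Under this convention, an entry such as -45.4 in the table means that the codec needs about 45.4% less bitrate than its anchor to reach the same PSNR, averaged over the measured quality range.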