Frequency-aware Neural Representation for Videos
About
Implicit Neural Representations (INRs) have emerged as a promising paradigm for video compression. However, existing INR-based frameworks typically suffer from an inherent spectral bias: they favor low-frequency components, which leads to over-smoothed reconstructions and suboptimal rate-distortion performance. In this paper, we propose FaNeRV, a Frequency-aware Neural Representation for videos, which explicitly decouples low- and high-frequency components to enable efficient and faithful video reconstruction. FaNeRV introduces a multi-resolution supervision strategy that guides the network, in stages, to progressively capture global structures and then fine-grained textures. To further enhance high-frequency reconstruction, we propose a dynamic high-frequency injection mechanism that adaptively emphasizes challenging regions. In addition, we design a frequency-decomposed network module to improve feature modeling across different spectral bands. Extensive experiments on standard benchmarks demonstrate that FaNeRV significantly outperforms state-of-the-art INR methods and achieves competitive rate-distortion performance against traditional codecs.
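The exact frequency-decomposed module is described in the paper; as a rough illustration of the general idea of splitting features into low- and high-frequency bands, here is a minimal sketch using average pooling as the low-pass filter and the residual as the high-pass part. The function name, the pooling-based filter, and the parameters are illustrative assumptions, not FaNeRV's actual implementation.

```python
import numpy as np

def decompose_frequencies(feat, pool=2):
    """Illustrative low/high frequency split of a 2D feature map.

    Hypothetical sketch: low-pass = block average pooling followed by
    nearest-neighbor upsampling; high-pass = residual. FaNeRV's actual
    module may use a different (e.g. learned or DCT-based) filter.
    """
    h, w = feat.shape
    assert h % pool == 0 and w % pool == 0, "map must be divisible by pool"
    # Low-pass: average over pool x pool blocks, then upsample back.
    low = feat.reshape(h // pool, pool, w // pool, pool).mean(axis=(1, 3))
    low_up = np.repeat(np.repeat(low, pool, axis=0), pool, axis=1)
    # High-pass: the residual carries edges and fine textures.
    high = feat - low_up
    return low_up, high

feat = np.arange(16, dtype=np.float64).reshape(4, 4)
low, high = decompose_frequencies(feat)
# The two bands sum back to the original feature map exactly.
assert np.allclose(low + high, feat)
```

Because the split is additive, the two branches can be processed separately (e.g. with differently weighted losses) and recombined without losing information.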
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Video Compression | UVG | BD-Rate (PSNR): 0.00e+0 | 49 |
| Video Compression | HEVC ClassB | BD-Rate (MS-SSIM): 0.00e+0 | 17 |
| Video Compression | BQTerrace | Encoding Time: 254 | 7 |
| Video Regression | HEVC ClassB 12 | Bas: 34.3 | 6 |