
HNeRV: A Hybrid Neural Representation for Videos

About

Implicit neural representations store videos as neural networks and have performed well on various vision tasks such as video compression and denoising. With frame index or positional index as input, implicit representations (NeRV, E-NeRV, etc.) reconstruct video from fixed and content-agnostic embeddings. Such embeddings largely limit the regression capacity and internal generalization for video interpolation. In this paper, we propose a Hybrid Neural Representation for Videos (HNeRV), where a learnable encoder generates content-adaptive embeddings, which act as the decoder input. Besides the input embedding, we introduce HNeRV blocks, which ensure model parameters are evenly distributed across the entire network, such that higher layers (layers near the output) have more capacity to store high-resolution content and video details. With content-adaptive embeddings and a redesigned architecture, HNeRV outperforms implicit methods in video regression tasks for both reconstruction quality (+4.7 PSNR) and convergence speed (16× faster), and shows better internal generalization. As a simple and efficient video representation, HNeRV also shows decoding advantages in speed, flexibility, and deployment compared to traditional codecs (H.264, H.265) and learning-based compression methods. Finally, we explore the effectiveness of HNeRV on downstream tasks such as video compression and video inpainting. We provide the project page at https://haochen-rye.github.io/HNeRV and the code at https://github.com/haochen-rye/HNeRV
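
To make the contrast between content-agnostic and content-adaptive embeddings concrete, here is a minimal sketch in plain Python. It is not the authors' implementation. The function names, the frequency settings (`num_freqs`, `base`), and the average-pooling "encoder" are illustrative assumptions; in particular, the pooling is a fixed, non-learnable stand-in for the learnable encoder described in the abstract.

```python
import math


def positional_embedding(t, num_freqs=4, base=1.25):
    """Content-agnostic embedding: depends only on the normalized frame index t.

    num_freqs and base are illustrative values (assumptions), not taken
    from the abstract.
    """
    emb = []
    for i in range(num_freqs):
        angle = (base ** i) * math.pi * t
        emb.extend([math.sin(angle), math.cos(angle)])
    return emb


def content_embedding(frame, grid=2):
    """Toy content-adaptive embedding: average-pool a 2D frame to grid x grid.

    Hypothetical stand-in for a learnable encoder; it has no trainable
    parameters and only shows that the embedding depends on pixel content.
    """
    height, width = len(frame), len(frame[0])
    block_h, block_w = height // grid, width // grid
    emb = []
    for gy in range(grid):
        for gx in range(grid):
            block = [
                frame[y][x]
                for y in range(gy * block_h, (gy + 1) * block_h)
                for x in range(gx * block_w, (gx + 1) * block_w)
            ]
            emb.append(sum(block) / len(block))
    return emb


# Two different 4x4 grayscale frames that share the same frame index.
frame_a = [[0.0] * 4 for _ in range(4)]
frame_b = [[0.0, 0.0, 1.0, 1.0] for _ in range(4)]

# The index-based embedding cannot tell the two frames apart ...
same_index = positional_embedding(0.5) == positional_embedding(0.5)
# ... while the content-based embedding differs whenever the pixels differ.
differs = content_embedding(frame_a) != content_embedding(frame_b)
```

In this toy setting the positional embedding is identical for any two frames that share an index, while the pooled embedding changes with the pixels. That difference is the property the abstract credits for better regression capacity and interpolation.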

Hao Chen, Matt Gwilliam, Ser-Nam Lim, Abhinav Shrivastava • 2023

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Video Reconstruction | Bunny | PSNR | 37.74 | 34 |
| Video Compression | UVG standard (full) | Beauty Quality Score | 34.3 | 24 |
| Video Inpainting | DAVIS | PSNR | 32.03 | 22 |
| Implicit Video Representation | UVG-HD full 1920x1080 | PSNR (Beauty) | 34.3 | 18 |
| Video Representation | UVG | Encoding FPS | 24.6 | 18 |
| Video Representation | UVG (test) | Beauty | 0.9066 | 18 |
| Video Representation | Bunny dataset | PSNR | 36.95 | 18 |
| Video Reconstruction | DAVIS | PSNR | 28.93 | 15 |
| Video Reconstruction | UVG (test) | Beauty Score | 33.88 | 13 |
| Video Decoding | Videos | FPS | 395.9 | 12 |
Showing 10 of 28 rows
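
Most results above are reported as PSNR. As a reading aid, the following sketch computes PSNR with the standard definition, 10 · log10(MAX² / MSE). It assumes frames are nested lists of pixel values in [0, 1]; the helper names are hypothetical, and the benchmarks' exact evaluation protocol (for example, how per-frame values are averaged) is not stated on this page.

```python
import math


def mse(frame_a, frame_b):
    """Mean squared error between two equally sized 2D frames."""
    flat_a = [p for row in frame_a for p in row]
    flat_b = [p for row in frame_b for p in row]
    if len(flat_a) != len(flat_b):
        raise ValueError("frames must have the same number of pixels")
    return sum((x - y) ** 2 for x, y in zip(flat_a, flat_b)) / len(flat_a)


def psnr(frame_a, frame_b, max_val=1.0):
    """Peak signal-to-noise ratio in dB; max_val is the peak pixel value."""
    err = mse(frame_a, frame_b)
    if err == 0:
        return math.inf  # identical frames
    return 10.0 * math.log10(max_val ** 2 / err)


def mse_ratio_for_gain(gain_db):
    """Factor by which MSE shrinks for a given PSNR gain at a fixed peak value."""
    return 10.0 ** (gain_db / 10.0)


# A uniform error of 0.1 on [0, 1] pixels gives an MSE of 0.01, about 20 dB.
reference = [[0.5, 0.5], [0.5, 0.5]]
reconstruction = [[0.4, 0.4], [0.4, 0.4]]
example_psnr = psnr(reference, reconstruction)
```

Under this definition, and assuming the gain applies to a single MSE value, the +4.7 PSNR improvement quoted in the abstract corresponds to `mse_ratio_for_gain(4.7)`, roughly a 2.95× reduction in mean squared error.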
