DNeRV: Modeling Inherent Dynamics via Difference Neural Representation for Videos

About

Existing implicit neural representation (INR) methods do not fully exploit spatiotemporal redundancies in videos. Index-based INRs ignore the content-specific spatial features and hybrid INRs ignore the contextual dependency on adjacent frames, leading to poor modeling capability for scenes with large motion or dynamics. We analyze this limitation from the perspective of function fitting and reveal the importance of frame difference. To use explicit motion information, we propose Difference Neural Representation for Videos (DNeRV), which consists of two streams for content and frame difference. We also introduce a collaborative content unit for effective feature fusion. We test DNeRV for video compression, inpainting, and interpolation. DNeRV achieves competitive results against the state-of-the-art neural compression approaches and outperforms existing implicit methods on downstream inpainting and interpolation for $960 \times 1920$ videos.

Qi Zhao, M. Salman Asif, Zhan Ma • 2023
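The abstract describes two input streams, one for frame content and one for frame difference. As a rough illustration of what a difference input could look like, the sketch below computes backward and forward differences between adjacent frames with NumPy. The function name `frame_differences`, the `(T, H, W, C)` layout, and the choice to stack both differences along the channel axis are assumptions made for this example; they are not taken from the authors' code.

```python
import numpy as np


def frame_differences(video):
    """Return backward and forward differences for each interior frame.

    video: array of shape (T, H, W, C), with T >= 3.
    Returns an array of shape (T - 2, H, W, 2 * C) whose last axis stacks
    I_t - I_{t-1} and I_{t+1} - I_t for t = 1 .. T - 2.
    (Hypothetical helper for illustration only.)
    """
    video = np.asarray(video, dtype=np.float32)
    if video.ndim != 4 or video.shape[0] < 3:
        raise ValueError("expected shape (T, H, W, C) with T >= 3")
    backward = video[1:-1] - video[:-2]  # I_t - I_{t-1}
    forward = video[2:] - video[1:-1]    # I_{t+1} - I_t
    return np.concatenate([backward, forward], axis=-1)


# Toy clip: 5 frames of 4x6 RGB whose brightness rises by 0.1 per frame.
clip = np.stack(
    [np.full((4, 6, 3), 0.1 * t, dtype=np.float32) for t in range(5)]
)
diffs = frame_differences(clip)  # candidate input for a difference stream
content = clip[1:-1]             # matching frames for a content stream
```

In this toy clip every difference is a constant 0.1, so the difference stream carries the whole change between frames while the content stream carries the absolute brightness. In a real video the differences would be concentrated in regions with motion.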

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Video Reconstruction | Bunny | PSNR | 34.09 | 34 |
| Video Reconstruction | DAVIS | PSNR | 29.66 | 22 |
| Video Reconstruction | UVG (test) | Beauty Score | 33.16 | 20 |
| Neural Video Representation | Video per-frame | GFLOPs | 181 | 12 |
| Video Representation | DAVIS | PSNR (Average) | 30.39 | 11 |
| Video Regression | UVG | Beauty | 40 | 10 |
| Neural Video Representation | Literature Comparison | GFLOPs | 181 | 10 |
| Video Inpainting | DAVIS (central mask) | b-swan Score | 26.47 | 8 |
| Video Reconstruction | UVG 600 frames | Decoding Speed (FPS) | 52.2 | 8 |
| Video Inpainting | DAVIS 960 × 1920 | Bmx-B | 25.7 | 6 |
Showing 10 of 17 rows
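Several rows above report PSNR. For reference, PSNR is defined as $10 \log_{10}(\mathrm{MAX}^2 / \mathrm{MSE})$, where MAX is the peak pixel value and MSE is the mean squared error between the reference and the reconstruction. The sketch below is a minimal NumPy version; the helper name `psnr`, the default peak of 1.0, and averaging the error over the whole array are assumptions, and the benchmarks listed here may use a different evaluation protocol (for example, per-frame PSNR averaged over a sequence).

```python
import numpy as np


def psnr(reference, reconstruction, max_value=1.0):
    """Peak signal-to-noise ratio in dB between two same-shaped arrays.

    max_value is the peak pixel value (1.0 for normalized images,
    255 for 8-bit images). Hypothetical helper for illustration only.
    """
    reference = np.asarray(reference, dtype=np.float64)
    reconstruction = np.asarray(reconstruction, dtype=np.float64)
    if reference.shape != reconstruction.shape:
        raise ValueError("inputs must have the same shape")
    mse = np.mean((reference - reconstruction) ** 2)
    if mse == 0:
        return float("inf")  # identical inputs
    return float(10.0 * np.log10(max_value ** 2 / mse))


# A uniform error of 0.01 on a [0, 1] image gives MSE = 1e-4, i.e. 40 dB.
reference = np.zeros((4, 6, 3))
reconstruction = reference + 0.01
value = psnr(reference, reconstruction)
```

Because the scale is logarithmic, a 10 dB gain corresponds to a tenfold reduction in mean squared error.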
