GIFStream: 4D Gaussian-based Immersive Video with Feature Stream
About
Immersive video offers a free-viewing experience with six degrees of freedom (6-DoF), potentially playing a key role in future video technology. Recently, 4D Gaussian Splatting has gained attention as an effective approach for immersive video due to its high rendering efficiency and quality, though maintaining quality with manageable storage remains challenging. To address this, we introduce GIFStream, a novel 4D Gaussian representation using a canonical space and a deformation field enhanced with time-dependent feature streams. These feature streams enable complex motion modeling and allow efficient compression by leveraging temporal correspondence and motion-aware pruning. Additionally, we incorporate both temporal and spatial compression networks for end-to-end compression. Experimental results show that GIFStream delivers high-quality immersive video at 30 Mbps, with real-time rendering and fast decoding on an RTX 4090. Project page: https://xdimlab.github.io/GIFStream
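To put the 30 Mbps figure in storage terms, the following is a minimal arithmetic sketch. Only the bitrate comes from the abstract; the 30 fps frame rate, the clip durations, and the helper names are illustrative assumptions, and megabits and megabytes are taken as decimal units (10^6).

```python
def storage_megabytes(bitrate_mbps: float, duration_s: float) -> float:
    """Storage in megabytes (10^6 bytes) for a stream at `bitrate_mbps` megabits per second."""
    return bitrate_mbps * duration_s / 8.0  # 8 bits per byte


def megabytes_per_frame(bitrate_mbps: float, fps: float) -> float:
    """Average per-frame budget in megabytes at the given frame rate."""
    return bitrate_mbps / 8.0 / fps


BITRATE_MBPS = 30.0  # bitrate reported in the abstract
ASSUMED_FPS = 30.0   # assumption for illustration; not stated in the abstract

per_second_mb = storage_megabytes(BITRATE_MBPS, 1.0)
ten_second_mb = storage_megabytes(BITRATE_MBPS, 10.0)
per_frame_mb = megabytes_per_frame(BITRATE_MBPS, ASSUMED_FPS)

print(f"{per_second_mb} MB per second")
print(f"{ten_second_mb} MB per 10 s clip")
print(f"{per_frame_mb} MB per frame at {ASSUMED_FPS:g} fps")
```

Under these assumptions the stream costs 3.75 MB per second of video, so the per-frame budget scales inversely with whatever frame rate the content actually uses.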
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Dynamic Scene Reconstruction | N3DV (test) | PSNR | 31.75 | 32 |
| Dynamic 3D Reconstruction | N3DV | PSNR (dB) | 31.75 | 16 |
| Novel View Synthesis | Neur3D | PSNR | 31.75 | 8 |
| Novel View Synthesis | Panoptic Sport basketball and boxes | PSNR | 29.5 | 7 |
| Novel View Synthesis | MPEG | PSNR | 30.72 | 6 |
| Long-range 4D Motion Modeling | SelfCapLR Yoga | PSNR (dB) | 22.02 | 6 |
| Long-range 4D Motion Modeling | SelfCapLR Corgi newly composed | PSNR (dB) | 19.83 | 6 |
| Long-range 4D Motion Modeling | SelfCapLR newly composed | PSNR (dB) | 19.02 | 6 |
| Long-range motion modeling | SelfCapLR | tOF | 0.539 | 6 |
| Long-range 4D Motion Modeling | SelfCapLR Bike1 newly composed | PSNR (dB) | 18.43 | 6 |