NeRFPlayer: A Streamable Dynamic Scene Representation with Decomposed Neural Radiance Fields
About
Freely exploring a real-world 4D spatiotemporal space in VR has been a long-standing goal. The task is especially appealing when only a few RGB cameras, or even a single one, are used to capture the dynamic scene. To this end, we present an efficient framework capable of fast reconstruction, compact modeling, and streamable rendering. First, we propose to decompose the 4D spatiotemporal space according to temporal characteristics. Points in the 4D space are associated with probabilities of belonging to three categories: static, deforming, and new areas. Each area is represented and regularized by a separate neural field. Second, we propose a feature streaming scheme based on hybrid representations for efficiently modeling the neural fields. Our approach, coined NeRFPlayer, is evaluated on dynamic scenes captured by single hand-held cameras and multi-camera arrays. It achieves rendering quality and speed comparable or superior to recent state-of-the-art methods, with reconstruction in 10 seconds per frame and interactive rendering.
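The decomposition idea can be illustrated with a small sketch. It is not the authors' implementation: the softmax over per-category logits, the probability-weighted blending rule, and the names `softmax`, `blend_fields`, and `toy_fields` are all assumptions made for illustration, and the three "fields" are constant toy functions standing in for learned neural fields.

```python
import math

# Hypothetical category names, taken from the abstract's three areas.
CATEGORIES = ("static", "deforming", "new")


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def blend_fields(point, logits, fields):
    """Blend per-category field outputs at one 4D point (x, y, z, t).

    Assumption: each field returns (density, (r, g, b)) and the outputs
    are mixed by the point's category probabilities. The source does not
    specify the blending rule; this is only one plausible choice.
    """
    probs = softmax(logits)
    density = 0.0
    rgb = [0.0, 0.0, 0.0]
    for name, p in zip(CATEGORIES, probs):
        sigma, color = fields[name](point)
        density += p * sigma
        for i in range(3):
            rgb[i] += p * color[i]
    return density, tuple(rgb), dict(zip(CATEGORIES, probs))


# Toy stand-ins for the three neural fields (constant outputs).
toy_fields = {
    "static": lambda p: (3.0, (1.0, 0.0, 0.0)),
    "deforming": lambda p: (6.0, (0.0, 1.0, 0.0)),
    "new": lambda p: (0.0, (0.0, 0.0, 1.0)),
}

# Equal logits give each category the same probability.
density, rgb, probs = blend_fields((0.1, 0.2, 0.3, 0.5), [0.0, 0.0, 0.0], toy_fields)
```

With equal logits each category receives probability 1/3, so the blended density is the mean of the three toy densities. In a learned model, the logits would presumably come from a network evaluated at the 4D point rather than being passed in by hand.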
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Novel View Synthesis | Neural 3D Video Dataset Standard (All six scenes) | PSNR 30.69 | 36 |
| Dynamic Scene Reconstruction | N3DV (test) | PSNR 30.69 | 32 |
| Novel View Synthesis | Neu3D (test) | PSNR 30.69 | 18 |
| Dynamic Scene Reconstruction | Neural 3D Video 19 (full) | PSNR 30.96 | 17 |
| Dynamic View Synthesis | Neural 3D Video 19 (test) | PSNR 30.96 | 16 |
| 3D Video Synthesis | Neural 3D Video Dataset (Cut Roasted Beef scene) | PSNR 29.35 | 12 |
| Novel View Rendering | N3DV Flame Steak | PSNR 31.93 | 11 |
| Novel View Rendering | N3DV Cook Spinach | PSNR 30.58 | 11 |
| Novel View Rendering | N3DV Sear Steak | PSNR 29.13 | 11 |
| Novel View Rendering | N3DV Cut Roast Beef | PSNR 29.35 | 11 |
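
For readers unfamiliar with the metric in the Result column, the following is a minimal sketch of how PSNR is commonly computed from mean squared error. It assumes pixel values scaled to [0, 1] and flattened into plain lists; the function name `psnr` and the `max_value` parameter are illustrative, and the exact evaluation protocol behind the numbers above is not stated on this page.

```python
import math


def psnr(reference, rendered, max_value=1.0):
    """Peak signal-to-noise ratio in dB between two flat pixel sequences.

    Assumption: both inputs hold values in [0, max_value] and have the
    same, non-zero length.
    """
    if len(reference) != len(rendered) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((a - b) ** 2 for a, b in zip(reference, rendered)) / len(reference)
    if mse == 0.0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


# Toy two-pixel example: one pixel matches, one is off by the full range.
example_score = psnr([0.0, 1.0], [0.0, 0.0])
```

Higher PSNR means the rendered image is closer to the reference, so larger values in the table indicate better reconstructions.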