Multi-Level Neural Scene Graphs for Dynamic Urban Environments
About
We estimate the radiance field of large-scale dynamic areas from multiple vehicle captures under varying environmental conditions. Previous works in this domain are restricted to static environments, do not scale to more than a single short video, or struggle to separately represent dynamic object instances. To address these limitations, we present a novel, decomposable radiance field approach for dynamic urban environments. We propose a multi-level neural scene graph representation that scales to thousands of images from dozens of sequences with hundreds of fast-moving objects. To enable efficient training and rendering of our representation, we develop a fast composite ray sampling and rendering scheme. To test our approach in urban driving scenarios, we introduce a new benchmark for novel view synthesis. We show that our approach outperforms prior art by a significant margin on both established benchmarks and our proposed benchmark while being faster in training and rendering.
Related benchmarks
| Task | Dataset | PSNR (dB) | Rank |
|---|---|---|---|
| Novel View Synthesis | KITTI 75% views (train) | 28.38 | 14 |
| Novel View Synthesis | KITTI 50% views (train) | 27.51 | 14 |
| Novel View Synthesis | KITTI 25% views (train) | 26.51 | 10 |
| Novel View Synthesis | VKITTI 2 (25% train views) | 28.29 | 10 |
| Novel View Synthesis | KITTI 75% (test) | 28.38 | 7 |
| Novel View Synthesis | KITTI 50% (test) | 27.51 | 7 |
| Novel View Synthesis | KITTI 25% (test) | 26.51 | 7 |
| Novel View Synthesis | VKITTI2 75% (test) | 29.73 | 7 |
| Novel View Synthesis | VKITTI2 50% (test) | 29.19 | 7 |
| Novel View Synthesis | VKITTI2 25% (test) | 28.29 | 7 |
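
All results in the table are PSNR (peak signal-to-noise ratio) values, where higher is better. As a reminder of what the metric measures, here is a minimal sketch of the standard definition, PSNR = 10 · log10(MAX² / MSE). The function name `psnr`, the flattened-pixel input format, and the [0, 1] intensity range are assumptions made for illustration; the benchmark's exact evaluation protocol (value range, per-image averaging) is not specified here.

```python
import math


def psnr(rendered, reference, max_value=1.0):
    """Peak signal-to-noise ratio in dB between two images.

    Both inputs are flat sequences of pixel intensities (an assumed
    format for this sketch), with max_value the largest possible intensity.
    """
    if len(rendered) != len(reference):
        raise ValueError("images must contain the same number of pixels")
    if len(rendered) == 0:
        raise ValueError("images must not be empty")

    # Mean squared error over all pixel values.
    mse = sum((a - b) ** 2 for a, b in zip(rendered, reference)) / len(rendered)

    # Identical images have zero error, so PSNR is unbounded.
    if mse == 0.0:
        return float("inf")

    return 10.0 * math.log10(max_value ** 2 / mse)


# A uniform error of 0.1 on a [0, 1] scale gives MSE = 0.01, i.e. about 20 dB.
print(round(psnr([0.1] * 12, [0.0] * 12), 2))
```

Because the scale is logarithmic, a gain of 1 dB corresponds to roughly a 21% reduction in mean squared error, which helps when comparing nearby rows of the table.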