Towards Realistic Scene Generation with LiDAR Diffusion Models
About
Diffusion models (DMs) excel in photo-realistic image synthesis, but their adaptation to LiDAR scene generation poses a substantial hurdle. This is primarily because DMs operating in the point space struggle to preserve the curve-like patterns and 3D geometry of LiDAR scenes, a task that consumes much of their representation power. In this paper, we propose LiDAR Diffusion Models (LiDMs) to generate LiDAR-realistic scenes from a latent space tailored to capture the realism of LiDAR scenes by incorporating geometric priors into the learning pipeline. Our method targets three major desiderata: pattern realism, geometry realism, and object realism. Specifically, we introduce curve-wise compression to simulate real-world LiDAR patterns, point-wise coordinate supervision to learn scene geometry, and patch-wise encoding for a full 3D object context. With these three core designs, our method achieves competitive performance on unconditional LiDAR generation in the 64-beam scenario and state-of-the-art performance on conditional LiDAR generation, while maintaining high efficiency compared to point-based DMs (up to 107$\times$ faster). Furthermore, by compressing LiDAR scenes into a latent space, we enable the controllability of DMs with various conditions such as semantic maps, camera views, and text prompts.
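To make the distinction between curve-wise compression and patch-wise encoding more concrete, here is a minimal sketch. It rests on assumptions the abstract does not state: that a scan is stored as a 2D range image with one row per beam and one column per azimuth bin, and that fixed average pooling can stand in for the learned encoders. The function names `curve_wise_pool` and `patch_wise_pool`, the 1024 azimuth bins, and the factor of 4 are all hypothetical and chosen only for illustration.

```python
import numpy as np


def curve_wise_pool(range_image: np.ndarray, factor: int) -> np.ndarray:
    """Average-pool along the azimuth (width) axis only.

    Every beam row is kept, so each scan line (curve) is shortened
    but never mixed with its neighbouring beams.
    """
    h, w = range_image.shape
    if w % factor != 0:
        raise ValueError("width must be divisible by factor")
    return range_image.reshape(h, w // factor, factor).mean(axis=2)


def patch_wise_pool(range_image: np.ndarray, factor: int) -> np.ndarray:
    """Average-pool over factor x factor patches.

    Both axes shrink, so each output cell summarises several
    adjacent beams as well as several azimuth bins.
    """
    h, w = range_image.shape
    if h % factor != 0 or w % factor != 0:
        raise ValueError("height and width must be divisible by factor")
    blocks = range_image.reshape(h // factor, factor, w // factor, factor)
    return blocks.mean(axis=(1, 3))


# Toy 64-beam scan with an assumed 1024 azimuth bins (random ranges in metres).
rng = np.random.default_rng(0)
toy_scan = rng.uniform(1.0, 50.0, size=(64, 1024))

curve_latent = curve_wise_pool(toy_scan, factor=4)  # shape (64, 256)
patch_latent = patch_wise_pool(toy_scan, factor=4)  # shape (16, 256)
```

In this toy setting, curve-wise pooling preserves the per-beam structure that gives LiDAR scans their characteristic line patterns, while patch-wise pooling trades that structure for a wider spatial context per latent cell. The actual LiDM encoders are learned networks, not fixed pooling.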
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Unconditional LiDAR Generation | KITTI360 (val) | FSVD | 38.8 | 11 |
| Semantic Occupancy Prediction | Nuplan-Occ mini (val) | IoU | 5.5 | 10 |
| LiDAR Densification | KITTI-360 64-beam, ~120K to ~250K (val) | CD (m) | 0.1937 | 9 |
| LiDAR Densification | nuScenes 32-beam (val) | CD (m) | 0.2193 | 9 |
| LiDAR Scene Generation | KITTI-360 (val) | FRD | 334.6 | 9 |
| Unconditional LiDAR Generation | KITTI-360 19 | FRD | 334.6 | 8 |
| LiDAR Scene Generation | nuScenes 2 | FPD | 30.77 | 7 |
| LiDAR point cloud generation | KITTI-360 Text conditioned | FRD | 80.61 | 6 |
| Unconditional LiDAR Generation | KITTI-360 (train-val) | FSVD | 16.54 | 6 |
| Unconditional LiDAR Generation | KITTI-360 (val) | FSVD | 13.68 | 6 |