Scaling Up Occupancy-centric Driving Scene Generation: Dataset and Method
About
Driving scene generation is a critical domain for autonomous driving, enabling downstream applications such as perception and planning evaluation. Occupancy-centric methods have recently achieved state-of-the-art results by offering consistent conditioning across frames and modalities; however, their performance depends heavily on annotated occupancy data, which remains scarce. To overcome this limitation, we curate Nuplan-Occ, the largest semantic occupancy dataset to date, constructed from the widely used Nuplan benchmark. Its scale and diversity facilitate not only large-scale generative modeling but also downstream autonomous driving applications. Based on this dataset, we develop a unified framework that jointly synthesizes high-quality semantic occupancy, multi-view videos, and LiDAR point clouds. Our approach incorporates a spatio-temporal disentangled architecture to support high-fidelity spatial expansion and temporal forecasting of 4D dynamic occupancy. To bridge modal gaps, we further propose two novel techniques: a Gaussian splatting-based sparse point map rendering strategy that enhances multi-view video generation, and a sensor-aware embedding strategy that explicitly models LiDAR sensor properties for realistic multi-LiDAR simulation. Extensive experiments demonstrate that our method achieves superior generation fidelity and scalability compared to existing approaches and validate its practical value in downstream tasks.

Repo: https://github.com/Arlo0o/UniScene-Unified-Occupancy-centric-Driving-Scene-Generation/tree/v2
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Occupancy Generation | Nuplan-Occ mini (val) | mIoU | 94.7 | 10 |
| Semantic Occupancy Prediction | Nuplan-Occ mini (val) | IoU | 32.4 | 10 |
| Video Generation | Nuplan-Occ mini (val) | FID | 8.32 | 7 |
| Planning | NAVSIM Nuplan (test) | NC Score | 95.7 | 5 |
| LiDAR Generation | Nuplan mini (val) | MMD | 0.457 | 4 |
| Occupancy Generation | Nuplan-Occ full (val) | mIoU | 98.5 | 3 |
| LiDAR Generation | Nuplan full (val) | MMD | 0.575 | 1 |
| Video Generation | Nuplan-Occ (val) | FID | 7.59 | 1 |
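
For readers unfamiliar with the mIoU metric reported for occupancy generation, the following is a minimal sketch of the standard mean intersection-over-union computation on flattened voxel labels. It is not taken from the repository: the function name `mean_iou`, the integer label encoding, and the choice to skip classes absent from both grids are assumptions for illustration, and the benchmark's exact protocol (class set, handling of empty voxels, scaling to a percentage) may differ.

```python
from typing import Sequence


def mean_iou(pred: Sequence[int], target: Sequence[int], num_classes: int) -> float:
    """Mean IoU over classes present in pred or target (hypothetical helper).

    pred and target are flattened voxel grids of integer class labels
    in the range [0, num_classes).
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same number of voxels")

    intersection = [0] * num_classes
    pred_count = [0] * num_classes
    target_count = [0] * num_classes

    # Accumulate per-class voxel counts in a single pass.
    for p, t in zip(pred, target):
        pred_count[p] += 1
        target_count[t] += 1
        if p == t:
            intersection[p] += 1

    ious = []
    for c in range(num_classes):
        union = pred_count[c] + target_count[c] - intersection[c]
        if union > 0:  # assumption: skip classes absent from both grids
            ious.append(intersection[c] / union)

    return sum(ious) / len(ious) if ious else 0.0


# Toy example with three classes and six voxels.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 0, 1, 2, 2, 2]
score = mean_iou(pred, target, num_classes=3)
print(f"mIoU = {100 * score:.1f}")
```

In the toy example the per-class IoUs are 1, 1/2, and 2/3, so the mean is 13/18. Leaderboard values such as those in the table are conventionally this quantity scaled to a percentage.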