DiffAD: A Unified Diffusion Modeling Approach for Autonomous Driving
About
End-to-end autonomous driving (E2E-AD) has rapidly emerged as a promising approach toward achieving full autonomy. However, existing E2E-AD systems typically adopt a traditional multi-task framework, addressing perception, prediction, and planning through separate task-specific heads. Although they are trained in a fully differentiable manner, they still suffer from poor task coordination, and system complexity remains high. In this work, we introduce DiffAD, a novel diffusion probabilistic model that redefines autonomous driving as a conditional image generation task. By rasterizing heterogeneous targets onto a unified bird's-eye view (BEV) and modeling their latent distribution, DiffAD unifies the various driving objectives and jointly optimizes all driving tasks in a single framework, significantly reducing system complexity and harmonizing task coordination. The reverse process iteratively refines the generated BEV image, resulting in more robust and realistic driving behaviors. Closed-loop evaluations in CARLA demonstrate the superiority of the proposed method, which achieves a new state-of-the-art Success Rate and Driving Score.
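To make the idea of rasterizing heterogeneous targets onto a single BEV image concrete, here is a minimal sketch in Python with NumPy. It is not the authors' implementation: the function name `rasterize_bev`, the channel names, the grid size, the metric extent, and the ego-frame convention (x forward, y left) are all assumptions chosen for illustration. Each target type is reduced to a set of 2D points and drawn into its own channel of one image.

```python
import numpy as np


def rasterize_bev(targets, size=64, extent_m=32.0):
    """Rasterize named point sets (ego-frame metres) into a multi-channel BEV image.

    targets: dict mapping a channel name to an (N, 2) array-like of (x, y)
             points, with x forward and y left, in metres. The names and
             layout are hypothetical, not taken from the paper.
    Returns (image, names), where image has shape (C, size, size) and
    names lists the channel order.
    """
    names = sorted(targets)
    image = np.zeros((len(names), size, size), dtype=np.float32)
    scale = size / (2.0 * extent_m)  # pixels per metre

    for c, name in enumerate(names):
        pts = np.asarray(targets[name], dtype=np.float32).reshape(-1, 2)
        # The ego vehicle sits at the image centre. Forward (x) points up,
        # so the row index decreases as x grows; left (y) maps to smaller columns.
        rows = np.floor((extent_m - pts[:, 0]) * scale).astype(int)
        cols = np.floor((extent_m - pts[:, 1]) * scale).astype(int)
        # Drop points that fall outside the rasterized area.
        keep = (rows >= 0) & (rows < size) & (cols >= 0) & (cols < size)
        image[c, rows[keep], cols[keep]] = 1.0

    return image, names


# Hypothetical scene: one agent 10 m ahead, and a planned path with one
# waypoint beyond the 32 m extent (which is silently dropped).
bev, channels = rasterize_bev({
    "agents": [[10.0, 0.0]],
    "plan": [[0.0, 0.0], [5.0, 0.0], [100.0, 0.0]],
})
```

In a setup like the one the abstract describes, an image of this kind would be the generation target of the diffusion model rather than an intermediate feature; how DiffAD encodes each target type and its latent space is not specified here.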
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| End-to-end Autonomous Driving | Bench2Drive | Driving Score | 67.92 | 27 |
| Closed-loop Planning | Bench2Drive (test) | Driving Score | 67.92 | 21 |
| Autonomous Driving | Bench2Drive base (train) | Driving Score | 67.92 | 19 |
| End-to-end Autonomous Driving | Bench2Drive (test) | Driving Score | 67.92 | 13 |
| Autonomous Driving | Bench2Drive Multi-Ability Benchmark (test) | Merging Score | 30 | 10 |
| Open-loop planning | Bench2Drive (test) | Avg L2 Error (m) | 1.55 | 8 |
| Autonomous Driving | Bench2Drive | Avg. L2 | 1.55 | 4 |
| End-to-end Autonomous Driving | nuScenes | Parameters (M) | 545.6 | 3 |