DIVER: Reinforced Diffusion Breaks Imitation Bottlenecks in End-to-End Autonomous Driving
About
Most end-to-end autonomous driving methods rely on imitation learning from single expert demonstrations, often leading to conservative and homogeneous behaviors that limit generalization in complex real-world scenarios. In this work, we propose DIVER, an end-to-end driving framework that integrates reinforcement learning with diffusion-based generation to produce diverse and feasible trajectories. At the core of DIVER lies a reinforced diffusion-based generation mechanism. First, the model conditions on map elements and surrounding agents to generate multiple reference trajectories from a single ground-truth trajectory, alleviating the limitations of imitation learning that arise from relying solely on single expert demonstrations. Second, reinforcement learning is employed to guide the diffusion process, where reward-based supervision enforces safety and diversity constraints on the generated trajectories, thereby enhancing their practicality and generalization capability. Furthermore, to address the limitations of L2-based open-loop metrics in capturing trajectory diversity, we propose a novel Diversity metric to evaluate the diversity of multi-mode predictions. Extensive experiments on the closed-loop NAVSIM and Bench2Drive benchmarks, as well as the open-loop nuScenes dataset, demonstrate that DIVER significantly improves trajectory diversity, effectively addressing the mode collapse problem inherent in imitation learning.
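The abstract does not define the Diversity metric, so the sketch below is only an illustration of one common way to quantify spread across multi-mode predictions: the mean pairwise distance between predicted trajectories. The function name `mean_pairwise_distance` and the choice of averaging pointwise L2 distances are assumptions made for this example, not the metric proposed in the paper.

```python
import math
from itertools import combinations


def mean_pairwise_distance(trajectories):
    """Illustrative diversity score (an assumption, not DIVER's metric).

    `trajectories` is a list of K modes; each mode is a list of (x, y)
    waypoints of equal length. The score is the average, over all pairs
    of modes, of the mean pointwise L2 distance between the two modes.
    """
    if len(trajectories) < 2:
        # A single mode has no spread to measure.
        return 0.0

    pair_scores = []
    for a, b in combinations(trajectories, 2):
        if len(a) != len(b) or not a:
            raise ValueError("modes must be non-empty and of equal length")
        # Mean distance between matching waypoints of the two modes.
        pair_scores.append(
            sum(math.dist(p, q) for p, q in zip(a, b)) / len(a)
        )
    return sum(pair_scores) / len(pair_scores)


# Three parallel two-waypoint modes, offset by one unit laterally.
modes = [
    [(0.0, 0.0), (1.0, 0.0)],
    [(0.0, 1.0), (1.0, 1.0)],
    [(0.0, 2.0), (1.0, 2.0)],
]
score = mean_pairwise_distance(modes)
collapsed = mean_pairwise_distance([modes[0], modes[0]])
```

Under this assumed definition, a set of identical modes scores zero, which is the mode-collapse case the abstract describes, while modes that fan out score higher. An L2 error against a single ground-truth trajectory cannot make this distinction, since it only measures closeness to one demonstration.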
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Closed-loop Planning | Bench2Drive | Driving Score | 49.21 | 152 |
| Autonomous Driving | NAVSIM v1 (test) | NC | 98.5 | 147 |
| Autonomous Driving Planning | NAVSIM v1 | NC | 98.5 | 126 |
| Autonomous Driving Planning | NAVSIM v1 (test) | NC | 98.5 | 118 |
| Autonomous Driving Planning | NAVSIM navhard v2 | NC | 96.4 | 88 |
| Autonomous Driving | Bench2Drive | Merging Score | 35.08 | 43 |
| Closed-loop Planning | NAVSIM | NC Metric | 98.5 | 40 |
| Planning | NAVSIM v2 (Navtest) | NC | 97.5 | 24 |
| End-to-end Autonomous Driving | Bench2Drive base V0.0.3 (train) | DS Score | 68.9 | 16 |
| End-to-End Autonomous Driving Planning | NAVSIM v1 (navtest) | NC Score | 0.985 | 16 |