RAPiD: Real-time Deterministic Trajectory Planning via Diffusion Behavior Priors for Safe and Efficient Autonomous Driving
About
Diffusion-based trajectory planners have demonstrated strong capability for modeling the multimodal nature of human driving behavior, but their reliance on iterative stochastic sampling poses critical challenges for real-time, safety-critical deployment. In this work, we present RAPiD, a deterministic policy extraction framework that distills a pre-trained diffusion-based planner into an efficient policy, eliminating diffusion sampling at inference time. Using score-regularized policy optimization, we leverage the score function of the pre-trained diffusion planner as a behavior prior to regularize policy learning. To promote safety and passenger comfort, the policy is optimized against a critic trained to imitate a predictive driver controller, providing dense, safety-focused supervision beyond conventional imitation learning. Evaluations demonstrate that RAPiD achieves competitive performance on closed-loop nuPlan scenarios with an 8x speedup over diffusion baselines, while achieving state-of-the-art generalization among learning-based planners on the interPlan benchmark. The official website of this work is: https://github.com/ruturajreddy/RAPiD.
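The idea of score-regularized policy extraction can be illustrated with a toy sketch. This is an assumed 1-D setup, not the paper's actual implementation: the diffusion behavior prior is replaced by a Gaussian whose score is known in closed form, the critic by a simple quadratic, and the deterministic policy is recovered by gradient ascent on the critic value plus the prior's score as a regularizer.

```python
# Toy sketch of score-regularized policy extraction (hypothetical setup):
# the behavior prior is N(MU_PRIOR, SIGMA2), whose score d/da log p(a)
# is available analytically, and the critic Q(a) = -(a - A_STAR)^2
# stands in for the controller-imitating critic described above.

MU_PRIOR, SIGMA2 = 0.0, 1.0   # behavior prior: standard Gaussian
A_STAR = 1.0                  # action the critic prefers
ALPHA = 2.0                   # weight of the score regularizer

def prior_score(a):
    """Score of the Gaussian behavior prior, d/da log p(a)."""
    return -(a - MU_PRIOR) / SIGMA2

def critic_grad(a):
    """Gradient of Q(a) = -(a - A_STAR)^2 with respect to the action."""
    return -2.0 * (a - A_STAR)

def extract_policy(a0=0.0, lr=0.05, steps=500):
    """Deterministic policy extraction: ascend Q while the diffusion
    score pulls the action toward high-density prior behavior."""
    a = a0
    for _ in range(steps):
        a += lr * (critic_grad(a) + ALPHA * prior_score(a))
    return a

action = extract_policy()
# The fixed point solves 2*(A_STAR - a) + ALPHA*(MU_PRIOR - a)/SIGMA2 = 0,
# i.e. a = 0.5 here: the extracted action lands between the prior mean
# and the critic optimum, traded off by ALPHA.
```

With a larger `ALPHA` the extracted action stays closer to the behavior prior; with `ALPHA = 0` it collapses to pure critic maximization, losing the human-like prior entirely.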
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Closed-loop Planning | nuPlan 14 (val) | NR Score | 90.19 | 66 |
| Closed-loop Planning | nuPlan 14 Hard (test) | NR Score | 76.09 | 64 |
| Closed-loop Planning | nuPlan 14 (test) | NR Score | 89.98 | 45 |
| Trajectory Planning | interPlan | interPlan Score | 27 | 10 |
| Closed-loop Planning | nuPlan | Latency (ms) | 12.41 | 6 |