StyleDrive: Towards Driving-Style Aware Benchmarking of End-To-End Autonomous Driving
About
Personalization, while extensively studied in conventional autonomous driving pipelines, has been largely overlooked in the context of end-to-end autonomous driving (E2EAD), despite its critical role in fostering user trust, safety perception, and real-world adoption. A primary bottleneck is the absence of large-scale real-world datasets that systematically capture driving preferences, severely limiting the development and evaluation of personalized E2EAD models. In this work, we introduce the first large-scale real-world dataset explicitly curated for personalized E2EAD, integrating comprehensive scene topology with rich dynamic context derived from agent dynamics and semantics inferred via a fine-tuned vision-language model (VLM). We propose a hybrid annotation pipeline that combines behavioral analysis, rule-and-distribution-based heuristics, and subjective semantic modeling guided by VLM reasoning, with final refinement through human-in-the-loop verification. Building upon this dataset, we introduce the first standardized benchmark for systematically evaluating personalized E2EAD models. Empirical evaluations on state-of-the-art architectures demonstrate that incorporating personalized driving preferences significantly improves behavioral alignment with human demonstrations.
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Closed-loop Autonomous Driving | Bench2Drive | Driving Score (DS) 77.02 | 49 |
| Instruction-following Trajectory Evaluation | Personalized Driving Dataset Emergency Brake Scenario v1.0 (test) | Error Metric 17.2 | 9 |
| Instruction-following Trajectory Evaluation | Personalized Driving Dataset Merging Scenario v1.0 (test) | Evaluation Score 18 | 9 |
| Instruction-following Trajectory Evaluation | Personalized Driving Dataset Overtaking Scenario v1.0 (test) | E1 Score 8.1 | 9 |
| Instruction-following Trajectory Evaluation | Personalized Driving Dataset Traffic Sign Scenario v1.0 (test) | Trajectory Score E18.3 | 9 |