
X-Driver: Explainable Autonomous Driving with Vision-Language Models

About

End-to-end autonomous driving has advanced significantly, offering benefits over conventional pipelines such as system simplicity and stronger driving performance in both open-loop and closed-loop settings. However, existing frameworks still suffer from low success rates in closed-loop evaluations, highlighting their limitations in real-world deployment. In this paper, we introduce X-Driver, a unified multi-modal large language model (MLLM) framework designed for closed-loop autonomous driving, leveraging Chain-of-Thought (CoT) and autoregressive modeling to enhance perception and decision-making. We validate X-Driver across multiple autonomous driving tasks using public benchmarks in the CARLA simulation environment, including Bench2Drive [6]. Our experimental results demonstrate superior closed-loop performance, surpassing the current state of the art (SOTA) while improving the interpretability of driving decisions. These findings underscore the importance of structured reasoning in end-to-end driving and establish X-Driver as a strong baseline for future research in closed-loop autonomous driving.

Wei Liu, Jiyuan Zhang, Binxiong Zheng, Yufeng Hu, Yingzhan Lin, Zengfeng Zeng • 2025

Related benchmarks

Task                            Dataset                     Result              Rank
Closed-loop Planning            Bench2Drive                 Driving Score 51.7  90
Closed-loop Autonomous Driving  Bench2Drive closed-loop     DS 51.7             24
Closed-loop Autonomous Driving  bench2drive 50 closed-loop  Driving Score 57.8  3
