
AD-H: Language-guided Autonomous Driving with Hierarchical Agents

About

Language-guided autonomous driving requires bridging a large abstraction gap between high-level natural-language instructions and low-level vehicle control. End-to-end approaches that use a single multimodal large language model (MLLM) to map language directly to actions struggle with this mismatch, often failing to exploit the reasoning capabilities of the model and exhibiting limited generalization beyond the distributions of driving datasets used for fine-tuning. To address this issue, we propose AD-H, a hierarchical multi-agent framework that explicitly separates high-level decision-making from low-level vehicle execution. At the upper level, an MLLM-based planner interprets natural-language commands and environmental context to generate coherent mid-level driving instructions. At the lower level, a lightweight controller converts these mid-level instructions into precise, continuous control actions. This decomposition aligns with the functional strengths of each component: the planner focuses on semantic reasoning and task decomposition, while the controller ensures stable and accurate actuation. To support large-scale training under this hierarchy, we design a rule-based pipeline that reconstructs mid-level commands from driving signals, producing 1.15 million hierarchical annotation pairs. Extensive experiments show that AD-H outperforms state-of-the-art models despite using fewer parameters, namely 3B plus 350M compared with 7B, and achieves superior long-horizon generalization and instruction-following performance. We make our data and code publicly accessible at https://github.com/zhangzaibin/AD-H
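The rule-based reconstruction of mid-level commands from driving signals can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `Frame` schema, the threshold constants, the command vocabulary, and the fixed-window segmentation are hypothetical and are not taken from the AD-H pipeline or its repository.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Frame:
    """One timestep of logged driving signals (hypothetical schema)."""
    speed: float  # metres per second
    steer: float  # normalized steering in [-1, 1], negative = left
    brake: float  # brake pressure in [0, 1]


# Illustrative thresholds only; these are not values from the paper.
STEER_THRESHOLD = 0.15
SPEED_DELTA = 0.5
BRAKE_THRESHOLD = 0.3
STOPPED_SPEED = 0.1


def label_segment(frames: List[Frame]) -> str:
    """Map a short window of driving signals to one mid-level command."""
    if not frames:
        raise ValueError("segment must contain at least one frame")

    mean_steer = sum(f.steer for f in frames) / len(frames)
    max_brake = max(f.brake for f in frames)
    speed_change = frames[-1].speed - frames[0].speed

    # Rules are checked in priority order: stopping, turning, speed changes.
    if max_brake >= BRAKE_THRESHOLD and frames[-1].speed < STOPPED_SPEED:
        return "stop"
    if mean_steer <= -STEER_THRESHOLD:
        return "turn left"
    if mean_steer >= STEER_THRESHOLD:
        return "turn right"
    if speed_change >= SPEED_DELTA:
        return "accelerate"
    if speed_change <= -SPEED_DELTA:
        return "slow down"
    return "keep going straight"


def annotate(frames: List[Frame], window: int) -> List[Tuple[int, str]]:
    """Split a log into fixed windows and emit (start_index, command) pairs."""
    return [
        (start, label_segment(frames[start:start + window]))
        for start in range(0, len(frames), window)
    ]
```

In a hierarchy like the one described above, pairs of this kind would serve two roles: as targets for the planner (context in, mid-level command out) and as inputs for the controller (mid-level command in, control action out).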

Zaibin Zhang, Talas Fu, Shiyu Tang, Yuanhang Zhang, Yifan Wang, Lijun Wang, Huchuan Lu • 2024

Related benchmarks

| Task | Dataset | Driving Score (DS) | Rank |
|---|---|---|---|
| End-to-end Driving | LangAuto Tiny | 77.5 | 21 |
| End-to-end Driving | LangAuto Short | 56.1 | 21 |
| Language-conditioned Autonomous Driving | LangAuto Tiny | 77.5 | 13 |
| Language-conditioned Autonomous Driving | LangAuto Short | 56.1 | 13 |
| Instruction-driven Autonomous Driving | LangAuto (full) | 44 | 9 |
| Language-guided Autonomous Driving | LangAuto Long | 44 | 8 |
| Language-guided Autonomous Driving | LangAuto Mean | 59.2 | 8 |
| Autonomous Driving | LangAuto (full) | 44 | 5 |
| Language-conditioned Autonomous Driving | LangAuto | 44 | 4 |
| Autonomous Driving | LangAuto (novel-environment) | 59.9 | 3 |
Showing 10 of 11 rows
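The LangAuto Mean result is numerically consistent with an unweighted average of the Tiny, Short, and Long scores listed above. The page does not state how the mean is computed, so the averaging rule below is an assumption; the short check only confirms that the arithmetic matches.

```python
# Driving scores as listed on this page.
scores = {
    "LangAuto Tiny": 77.5,
    "LangAuto Short": 56.1,
    "LangAuto Long": 44.0,
}

# Assumption: the "Mean" row is a plain unweighted average of the three splits.
mean_ds = sum(scores.values()) / len(scores)
print(round(mean_ds, 1))
```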
