MLDT: Multi-Level Decomposition for Complex Long-Horizon Robotic Task Planning with Open-Source Large Language Model

About

In the realm of data-driven AI technology, the application of open-source large language models (LLMs) in robotic task planning represents a significant milestone. Recent robotic task planning methods based on open-source LLMs typically leverage vast task planning datasets to enhance models' planning abilities. While these methods show promise, they struggle with complex long-horizon tasks, which require comprehending more context and generating longer action sequences. This paper addresses this limitation by proposing MLDT, the Multi-Level Decomposition Task planning method. The method decomposes tasks at the goal level, task level, and action level to mitigate the difficulty of complex long-horizon tasks. To enhance open-source LLMs' planning abilities, we introduce a goal-sensitive corpus generation method to create high-quality training data and conduct instruction tuning on the generated corpus. Because existing datasets are not complex enough, we construct a more challenging dataset, LongTasks, specifically to evaluate planning ability on complex long-horizon tasks. We evaluate our method using various LLMs on four datasets in VirtualHome. Our results demonstrate a significant performance improvement in robotic task planning, showing MLDT's effectiveness in overcoming the limitations of existing methods based on open-source LLMs as well as its practicality in complex, real-world scenarios.
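
The three decomposition levels named in the abstract can be pictured as nested planning calls. The following is only a minimal sketch under stated assumptions, not the authors' implementation: the `decompose` function, the `Planner` type, the prompt prefixes, and the rule-based `toy_planner` (standing in for an LLM) are all hypothetical, and the bracketed action strings only loosely imitate a simulator-style action format.

```python
from typing import Callable, List

# Hypothetical interface: a planner maps a prompt string to a list of output lines.
# In a real system this would be an LLM call; here it is a stand-in.
Planner = Callable[[str], List[str]]


def decompose(goal: str, planner: Planner) -> List[str]:
    """Assumed three-stage pipeline: goal -> sub-goals -> tasks -> actions."""
    actions: List[str] = []
    for sub_goal in planner(f"goal-level: {goal}"):
        for task in planner(f"task-level: {sub_goal}"):
            actions.extend(planner(f"action-level: {task}"))
    return actions


def toy_planner(prompt: str) -> List[str]:
    """Rule-based stand-in for an LLM, for illustration only."""
    level, _, text = prompt.partition(": ")
    words = text.split()
    if level == "goal-level":
        # Split a compound goal into independent sub-goals.
        return [part.strip() for part in text.split(" and ")]
    if level == "task-level":
        # "put <obj> in|on <target>" -> fetch the object, then place it.
        obj, target = words[1], words[-1]
        return [f"fetch {obj}", f"place {obj} {target}"]
    # action-level: expand each task into primitive actions.
    if words[0] == "fetch":
        return [f"[walk] <{words[1]}>", f"[grab] <{words[1]}>"]
    obj, target = words[1], words[2]
    return [f"[walk] <{target}>", f"[put] <{obj}> <{target}>"]
```

Calling `decompose("put apple in fridge and put plate on table", toy_planner)` yields one flat action list. Each planner call only ever sees one sub-goal or one task, which is the intuition behind decomposing long-horizon tasks: the context per call stays short even when the overall plan is long.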

Yike Wu, Jiatao Zhang, Nan Hu, LanLing Tang, Guilin Qi, Jun Shao, Jie Ren, Wei Song • 2024

Related benchmarks

Task	Dataset	Metric	Value	
Embodied Agent Planning (Adversarial Safety Evaluation)	SafeAgentBench Unsafe Tasks - Jailbreak	Success Rate	40.13	12
Embodied Agent Planning (Safety Evaluation)	SafeAgentBench Unsafe Tasks	Success Rate	44.14	12
Embodied Agent Planning	SafeAgentBench Safe Tasks	Success Rate	58.86	12
Monkey-banana task execution	Dual Bananas Scene 4	Success Rate	99	6
Robotic Planning	Comprehensive Scene 13	Success Rate	98	6
Robotic Task Planning	Classic (Scene 2)	Success Rate	85	6
Monkey-banana task execution	Dual Bananas Scene 5	Success Rate	65	6
Robotic Planning	Comprehensive (Scene 14)	Success Rate	12	6
Robotic Task Planning	Classic (Scene 1)	Success Rate	98	6
Task Planning	Shortsighted Monkey (Scene 7)	Success Rate	96	6

Showing 10 of 19 rows
