LaMMA-P: Generalizable Multi-Agent Long-Horizon Task Allocation and Planning with LM-Driven PDDL Planner

About

Language models (LMs) possess a strong capability to comprehend natural language, making them effective in translating human instructions into detailed plans for simple robot tasks. Nevertheless, handling long-horizon tasks remains a significant challenge, especially in subtask identification and allocation for cooperative heterogeneous robot teams. To address this issue, we propose a Language Model-Driven Multi-Agent PDDL Planner (LaMMA-P), a novel multi-agent task planning framework that achieves state-of-the-art performance on long-horizon tasks. LaMMA-P integrates the reasoning capability of LMs with a traditional heuristic search planner to achieve a high success rate and efficiency while demonstrating strong generalization across tasks. Additionally, we create MAT-THOR, a comprehensive benchmark based on the AI2-THOR environment that features household tasks at two levels of complexity. The experimental results demonstrate that LaMMA-P achieves a 105% higher success rate and 36% higher efficiency than existing LM-based multi-agent planners. The experimental videos, code, datasets, and detailed prompts used in each module can be found on the project website: https://lamma-p.github.io.

Xiaopan Zhang, Hao Qin, Fuquan Wang, Yue Dong, Jiachen Li • 2024

Related benchmarks

Task	Dataset	Metric	Result
Multi-agent robot coordination	Multi-agent Robot Service Tasks Simulation	Success Rate	56	14
Multi-agent robot coordination	Multi-agent Robot Service Tasks Real-world	SR	47	14
Embodied Agent Planning	SafeAgentBench Safe Tasks	Success Rate	67.22	12
Embodied Agent Planning (Adversarial Safety Evaluation)	SafeAgentBench Unsafe Tasks - Jailbreak	SR	42.14	12
Embodied Agent Planning (Safety Evaluation)	SafeAgentBench Unsafe Tasks	Success Rate	53.18	12
Multi-robot long-horizon planning	MAT-THOR Vague	TCR (%)	57.1	6
Multi-robot long-horizon planning	MAT-THOR Basic	TCR (%)	60	6
Multi-robot long-horizon planning	MAT-THOR Complex	TCR	36.8	6
Task Planning and Program Generation	IMR-Bench Simple Multi-Robot	OC	71	6
Task Planning and Program Generation	IMR-Bench Complex Multi-Robot	OC	56	6

Showing 10 of 21 rows
