
LLM-Planner: Few-Shot Grounded Planning for Embodied Agents with Large Language Models

About

This study focuses on using large language models (LLMs) as planners for embodied agents that follow natural language instructions to complete complex tasks in a visually perceived environment. The high data cost and poor sample efficiency of existing methods hinder the development of versatile agents that are capable of many tasks and can learn new tasks quickly. In this work, we propose a novel method, LLM-Planner, that harnesses the power of large language models to do few-shot planning for embodied agents. We further propose a simple but effective way to enhance LLMs with physical grounding so that they generate and update plans grounded in the current environment. Experiments on the ALFRED dataset show that our method achieves very competitive few-shot performance: despite using less than 0.5% of the paired training data, LLM-Planner performs competitively with recent baselines that are trained on the full training data. Existing methods can barely complete any task successfully under the same few-shot setting. Our work opens the door to developing versatile and sample-efficient embodied agents that can quickly learn many tasks. Website: https://dki-lab.github.io/LLM-Planner
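The abstract's idea of few-shot planning that is grounded in the current environment can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the authors' implementation: the `Example` type, the prompt layout, the helper names (`build_prompt`, `parse_plan`, `replan`), and the `fake_llm` stub that stands in for a real model call are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Set


@dataclass
class Example:
    """Hypothetical in-context example: an instruction paired with a high-level plan."""
    instruction: str
    plan: List[str]


def build_prompt(instruction: str, examples: Sequence[Example],
                 visible_objects: Set[str], completed: Sequence[str]) -> str:
    """Assemble a few-shot prompt that also states what the agent has seen and done.

    The layout is an assumed one; the source does not specify a prompt template.
    """
    lines = ["Create a high-level plan for the task."]
    for ex in examples:
        lines.append(f"Task: {ex.instruction}")
        lines.append("Plan: " + ", ".join(ex.plan))
    lines.append(f"Task: {instruction}")
    # Grounding: list observed objects (sorted for a deterministic prompt).
    lines.append("Visible objects: " + ", ".join(sorted(visible_objects)))
    lines.append("Completed plan: " + ", ".join(completed))
    lines.append("Next plan:")
    return "\n".join(lines)


def parse_plan(completion: str) -> List[str]:
    """Split a comma-separated completion into individual subgoals."""
    return [step.strip() for step in completion.split(",") if step.strip()]


def replan(instruction: str, examples: Sequence[Example],
           visible_objects: Set[str], completed: Sequence[str],
           llm: Callable[[str], str]) -> List[str]:
    """Ask the language model for the remaining subgoals given the current state."""
    prompt = build_prompt(instruction, examples, visible_objects, completed)
    return parse_plan(llm(prompt))


def fake_llm(prompt: str) -> str:
    """Stand-in for a real LLM call; returns a fixed completion."""
    return "Pickup apple, Navigation fridge, Put fridge"
```

In this sketch an agent would call `replan` again whenever it gets stuck or observes new objects, passing the updated `visible_objects` and `completed` lists so that the next plan reflects the environment as currently perceived.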

Chan Hee Song, Jiaman Wu, Clayton Washington, Brian M. Sadler, Wei-Lun Chao, Yu Su • 2022

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Instruction Following | ALFRED (test-unseen) | GC 23.37 | 31 |
| Continual Instruction Following | ALFRED | Success Rate (SR) 58.44 | 28 |
| Embodied Task Completion | ALFRED seen (test) | Success Rate (SR) 18.2 | 26 |
| Embodied Task Completion | ALFRED unseen (test) | Success Rate 13.41 | 26 |
| Long-horizon Mathematical Reasoning | MATH | Result Accuracy 52.08 | 23 |
| Embodied Instruction Following | ALFRED seen 1.0 (test) | GC 24.57 | 20 |
| Embodied Task Planning | VirtualHome (Seen) | -- | 18 |
| Dual-arm task planning | Kitchen Scene | TEI 0.956 | 16 |
| Dual-arm task planning | Agricultural Greenhouse Scene | TFR 43.1 | 16 |
| Continual Instruction Following | VirtualHome | SR 40.97 | 15 |
Showing 10 of 57 rows
