LLM+P: Empowering Large Language Models with Optimal Planning Proficiency

About

Large language models (LLMs) have demonstrated remarkable zero-shot generalization abilities: state-of-the-art chatbots can provide plausible answers to many common questions that arise in daily life. However, so far, LLMs cannot reliably solve long-horizon planning problems. By contrast, classical planners, once a problem is given in a formal, structured representation, can use efficient search algorithms to quickly identify correct, or even optimal, plans. In an effort to get the best of both worlds, this paper introduces LLM+P, the first framework that incorporates the strengths of classical planners into LLMs. LLM+P takes in a natural language description of a planning problem, then returns a correct (or optimal) plan for solving that problem in natural language. LLM+P does so in three steps: it first converts the language description into a file written in the Planning Domain Definition Language (PDDL), then uses a classical planner to quickly find a solution, and finally translates the solution back into natural language. Along with LLM+P, we define a diverse set of benchmark problems taken from common planning scenarios. Via a comprehensive set of experiments on these benchmark problems, we find that LLM+P is able to provide optimal solutions for most problems, while LLMs fail to provide even feasible plans for most problems. The code and results are publicly available at https://github.com/Cranial-XIX/llm-pddl.git.
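
The translate, plan, translate-back pipeline described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the names `Action`, `Problem`, `bfs_plan`, `llm_plus_p`, `stub_translate`, and `stub_verbalize` are hypothetical, the two LLM steps are replaced by hand-written stubs, and a small breadth-first search over STRIPS-style facts stands in for a real PDDL planner. Breadth-first search returns a plan with the fewest actions, which mirrors the optimality property the abstract attributes to the planning step when every action has unit cost.

```python
from collections import deque
from typing import NamedTuple


class Action(NamedTuple):
    name: str
    pre: frozenset      # facts that must hold before the action
    add: frozenset      # facts the action makes true
    delete: frozenset   # facts the action makes false


class Problem(NamedTuple):
    init: frozenset
    goal: frozenset
    actions: tuple


def bfs_plan(problem):
    """Stand-in for a classical planner: shortest plan by breadth-first search."""
    frontier = deque([(problem.init, [])])
    seen = {problem.init}
    while frontier:
        state, plan = frontier.popleft()
        if problem.goal <= state:          # all goal facts hold
            return plan
        for a in problem.actions:
            if a.pre <= state:             # action is applicable
                nxt = (state - a.delete) | a.add
                if nxt not in seen:
                    seen.add(nxt)
                    frontier.append((nxt, plan + [a.name]))
    return None                            # no plan exists


def llm_plus_p(description, translate, verbalize):
    """Pipeline sketch: language -> formal problem -> plan -> language."""
    problem = translate(description)       # an LLM call in the real system
    plan = bfs_plan(problem)               # a PDDL planner in the real system
    return None if plan is None else verbalize(plan)


def stub_translate(description):
    # Assumption: a hand-written problem replaces the LLM's PDDL output.
    f = frozenset
    actions = (
        Action("pick-A", f({"robot-at-A", "ball-at-A"}), f({"holding"}), f({"ball-at-A"})),
        Action("move-A-B", f({"robot-at-A"}), f({"robot-at-B"}), f({"robot-at-A"})),
        Action("move-B-A", f({"robot-at-B"}), f({"robot-at-A"}), f({"robot-at-B"})),
        Action("drop-B", f({"robot-at-B", "holding"}), f({"ball-at-B"}), f({"holding"})),
    )
    return Problem(f({"robot-at-A", "ball-at-A"}), f({"ball-at-B"}), actions)


def stub_verbalize(plan):
    # Assumption: simple string joining replaces the LLM's plan-to-text step.
    return ", then ".join(plan)


answer = llm_plus_p("Bring the ball from room A to room B.",
                    stub_translate, stub_verbalize)
print(answer)  # pick-A, then move-A-B, then drop-B
```

Keeping the search in a separate function reflects the division of labor the abstract describes: the language model only handles translation, while correctness and optimality of the plan come from the search procedure.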

Bo Liu, Yuqian Jiang, Xiaohan Zhang, Qiang Liu, Shiqi Zhang, Joydeep Biswas, Peter Stone • 2023

Related benchmarks

| Task | Dataset | Success Rate | Rank |
|---|---|---|---|
| Robotic Task Planning in Dynamic Environments | VirtualHome | 44 | 16 |
| Robotic Task Planning in Dynamic Environments | OmniGibson | 37 | 16 |
| Robot Manipulation Planning | OWL-TAMP Citrus | 100 | 8 |
| Robot Manipulation Planning | OWL-TAMP Berry2 | 100 | 8 |
| Robot Manipulation Planning | OWL-TAMP Berry1 | 1 | 8 |
| Robot Manipulation Planning | OWL-TAMP Mug3 | 20 | 8 |
| Robot Manipulation Planning | OWL-TAMP Overall | 32 | 8 |
| Robot Manipulation Planning | OWL-TAMP Mug2 | 0 | 8 |
| Robot Manipulation Planning | OWL-TAMP SoupPour | 0 | 8 |
| Robot Manipulation Planning | OWL-TAMP BerryCook | 0 | 8 |
Showing 10 of 42 rows
