GenePlan: Evolving Better Generalized PDDL Plans using Large Language Models

About

We present GenePlan (GENeralized Evolutionary Planner), a novel framework that leverages large language model (LLM) assisted evolutionary algorithms to generate domain-dependent generalized planners for classical planning tasks described in PDDL. By casting generalized planning as an optimization problem, GenePlan iteratively evolves interpretable Python planners that minimize plan length across diverse problem instances. In empirical evaluation across six existing benchmark domains and two new domains, GenePlan achieved an average SAT score of 0.91, closely matching the performance of the state-of-the-art planners (SAT score 0.93), and significantly outperforming other LLM-based baselines such as chain-of-thought (CoT) prompting (average SAT score 0.64). The generated planners solve new instances rapidly (average 0.49 seconds per task) and at low cost (average $1.82 per domain using GPT-4o).

Andrew Murray, Danial Dervovic, Alberto Pozanco, Michael Cashmore • 2026
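The optimization framing in the abstract (evolve candidate planners, score each by plan length across instances, keep the best) can be illustrated with a minimal sketch. This is not GenePlan's actual implementation. The toy elevator domain stands in for PDDL, `propose_variant` is a hypothetical placeholder for the LLM call that would rewrite a parent planner, and every name below is an assumption made for illustration.

```python
import random
from typing import Callable, List, Sequence

Instance = Sequence[int]          # toy instance: floors an elevator must visit
Plan = List[int]                  # toy plan: floors in visiting order
Planner = Callable[[Instance], Plan]


def plan_length(plan: Plan, start: int = 0) -> int:
    """Count unit moves needed to visit the floors in the given order."""
    total, pos = 0, start
    for floor in plan:
        total += abs(floor - pos)
        pos = floor
    return total


def fitness(planner: Planner, instances: Sequence[Instance]) -> float:
    """Mean plan length over instances; infinite if any plan misses a goal."""
    total = 0
    for inst in instances:
        plan = planner(inst)
        if set(plan) != set(inst):
            return float("inf")   # invalid plan: does not reach every floor
        total += plan_length(plan)
    return total / len(instances)


# Hypothetical candidate planners. In an LLM-assisted setup these would be
# generated as source code rather than picked from a fixed pool.
def in_given_order(inst: Instance) -> Plan:
    return list(inst)


def ascending(inst: Instance) -> Plan:
    return sorted(inst)


def descending(inst: Instance) -> Plan:
    return sorted(inst, reverse=True)


def skip_last(inst: Instance) -> Plan:
    return list(inst)[:-1]        # deliberately invalid candidate


CANDIDATE_POOL = [in_given_order, ascending, descending, skip_last]


def propose_variant(parent: Planner, rng: random.Random) -> Planner:
    """Placeholder for an LLM call that rewrites `parent`.

    Here it ignores the parent and samples a planner from the fixed pool.
    """
    return rng.choice(CANDIDATE_POOL)


def evolve(instances: Sequence[Instance], generations: int = 20,
           population_size: int = 4, seed: int = 0) -> Planner:
    """Elitist loop: propose children, rank by fitness, keep the best."""
    rng = random.Random(seed)
    population: List[Planner] = [in_given_order] * population_size
    for _ in range(generations):
        children = [propose_variant(p, rng) for p in population]
        ranked = sorted(population + children,
                        key=lambda p: fitness(p, instances))
        population = ranked[:population_size]
    return population[0]


instances = [[3, 1, 2], [5, 2, 4]]
best = evolve(instances)
print(best.__name__, fitness(best, instances))
```

Because parents compete with their children at every generation, the best fitness in the population never gets worse, and invalid planners such as `skip_last` are ranked last by the infinite penalty.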

Related benchmarks

Task                  Dataset                       Result                   Rank
Generalized Planning  PDDL manymiconic domain       Solution Rate 100        10
Generalized Planning  PDDL trading domain           Solution Rate 100        10
Generalized Planning  PDDL trapnewspapers domain    Solution Percentage 100  10
Generalized Planning  PDDL                          Solution Percentage 100  10
Generalized Planning  PDDL manyferry domain         Solution Percentage 100  10
Generalized Planning  PDDL manygripper domain       Solution Rate 100        10
Generalized Planning  PDDL heavypack                Percent Solved 100       10
Generalized Planning  PDDL hiking domain            Solution Rate 100        10
