
Textual Planning with Explicit Latent Transitions

About

Planning with LLMs is bottlenecked by token-by-token generation and repeated full forward passes, making multi-step lookahead and rollout-based search expensive in latency and compute. We propose EmbedPlan, which replaces autoregressive next-state generation with a lightweight transition model operating in a frozen language embedding space. EmbedPlan encodes natural language state and action descriptions into vectors, predicts the next-state embedding, and retrieves the next state by nearest-neighbor similarity, enabling fast planning computation without fine-tuning the encoder. We evaluate next-state prediction across nine classical planning domains using six evaluation protocols of increasing difficulty: interpolation, plan-variant, extrapolation, multi-domain, cross-domain, and leave-one-out. Results show near-perfect interpolation performance but a sharp degradation when generalization requires transfer to unseen problems or unseen domains; plan-variant evaluation indicates generalization to alternative plans rather than memorizing seen trajectories. Overall, frozen embeddings support within-domain dynamics learning after observing a domain's transitions, while transfer across domain boundaries remains a bottleneck.
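The pipeline described above (encode state and action text with a frozen encoder, predict the next-state embedding with a lightweight transition model, retrieve the next state by nearest-neighbor similarity) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the toy `embed` projection and the untrained linear `predict_next` stand in for a real frozen language encoder and a trained transition model.

```python
import numpy as np

def nearest_neighbor(pred, candidates):
    # Retrieve the candidate state whose embedding has the highest
    # cosine similarity to the predicted next-state embedding.
    pred = pred / np.linalg.norm(pred)
    cand = candidates / np.linalg.norm(candidates, axis=1, keepdims=True)
    return int(np.argmax(cand @ pred))

rng = np.random.default_rng(0)

# Toy stand-in for a frozen text encoder: byte-count features
# pushed through a fixed random projection (hypothetical).
PROJ = rng.normal(size=(128, 16))
def embed(text):
    counts = np.zeros(128)
    for b in text.encode("utf-8"):
        counts[b % 128] += 1
    return counts @ PROJ

# Stand-in for the learned transition model f(s, a) -> next-state
# embedding; here just an untrained linear map for illustration.
W = rng.normal(size=(32, 16)) * 0.1
def predict_next(state_emb, action_emb):
    return np.concatenate([state_emb, action_emb]) @ W

# Usage: predict a transition and resolve it against a state bank.
states = ["robot at location A", "robot at location B", "robot at location C"]
bank = np.stack([embed(s) for s in states])
pred = predict_next(embed("robot at location A"), embed("move from A to B"))
print(states[nearest_neighbor(pred, bank)])
```

Because retrieval is a single matrix-vector product over the candidate bank, a multi-step rollout costs one transition-model call plus one similarity lookup per step, rather than a full autoregressive decode.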

Eliezer Shlomi, Ido Levy, Eilam Shapira, Michael Katz, Guy Uziel, Segev Shlomov, Nir Mashkif, Roi Reichart, Sarah Keren • 2026

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Textual Planning | PDDL domains, Untrained protocol | Hit@5: 3.9 | 1 |
| Textual Planning | PDDL domains, Cross-Domain protocol | Hit@5: 6.6 | 1 |
| Textual Planning | PDDL domains, Leave-One-Out protocol | Hit@5: 9.2 | 1 |
| Textual Planning | PDDL domains, Multi-Domain Ex. protocol | Hit@5: 37.2 | 1 |
| Textual Planning | PDDL domains, Single-Domain Ex. protocol | Hit@5: 0.546 | 1 |
| Textual Planning | PDDL domains, Plan-Variant protocol | Hit@5: 51.2 | 1 |
| Textual Planning | PDDL domains, Single-Domain In. protocol | Hit@5: 99.7 | 1 |
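The Hit@5 metric reported above presumably measures the fraction of transitions for which the true next state appears among the five candidates most similar to the predicted embedding. A minimal sketch of such a top-k retrieval metric, assuming cosine similarity over a fixed candidate bank (`hit_at_k` is a hypothetical helper, not from the paper):

```python
import numpy as np

def hit_at_k(preds, target_idx, candidates, k=5):
    # Fraction of predictions whose true next state is among the
    # k candidates with highest cosine similarity.
    preds = preds / np.linalg.norm(preds, axis=1, keepdims=True)
    cand = candidates / np.linalg.norm(candidates, axis=1, keepdims=True)
    sims = preds @ cand.T                    # (n_preds, n_candidates)
    topk = np.argsort(-sims, axis=1)[:, :k]  # indices of k most similar states
    hits = [t in row for t, row in zip(target_idx, topk)]
    return float(np.mean(hits))
```

Under this reading, the near-perfect Single-Domain In. score (99.7) and the low Cross-Domain score (6.6) correspond directly to the interpolation-vs-transfer gap described in the abstract.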
