Textual Planning with Explicit Latent Transitions
About
Planning with LLMs is bottlenecked by token-by-token generation and repeated full forward passes, making multi-step lookahead and rollout-based search expensive in both latency and compute. We propose EmbedPlan, which replaces autoregressive next-state generation with a lightweight transition model operating in a frozen language embedding space. EmbedPlan encodes natural-language state and action descriptions into vectors, predicts the next-state embedding, and retrieves the next state by nearest-neighbor similarity, enabling fast planning without fine-tuning the encoder. We evaluate next-state prediction across nine classical planning domains using six evaluation protocols of increasing difficulty: interpolation, plan-variant, extrapolation, multi-domain, cross-domain, and leave-one-out. Results show near-perfect interpolation performance but sharp degradation when generalization requires transfer to unseen problems or unseen domains; the plan-variant evaluation indicates generalization to alternative plans rather than memorization of seen trajectories. Overall, frozen embeddings support within-domain dynamics learning once a domain's transitions have been observed, while transfer across domain boundaries remains a bottleneck.
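A minimal sketch of the EmbedPlan-style transition step in PyTorch, assuming a sentence-transformers encoder; the `all-MiniLM-L6-v2` checkpoint, the `TransitionMLP` architecture, and the `predict_next_state` helper are illustrative choices, not the paper's exact configuration:

```python
import torch
import torch.nn as nn
from sentence_transformers import SentenceTransformer

# Frozen text encoder: state/action descriptions -> fixed vectors.
# Checkpoint choice is illustrative, not the paper's configuration.
encoder = SentenceTransformer("all-MiniLM-L6-v2")
DIM = encoder.get_sentence_embedding_dimension()

class TransitionMLP(nn.Module):
    """Lightweight transition model: (state emb, action emb) -> next-state emb."""
    def __init__(self, dim: int, hidden: int = 512):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2 * dim, hidden), nn.ReLU(), nn.Linear(hidden, dim)
        )

    def forward(self, state: torch.Tensor, action: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([state, action], dim=-1))

def predict_next_state(model, state_text, action_text, candidate_states, k=5):
    """Predict the next-state embedding, then retrieve the top-k candidate
    next states by cosine similarity (nearest-neighbor retrieval)."""
    with torch.no_grad():
        s = torch.tensor(encoder.encode(state_text))
        a = torch.tensor(encoder.encode(action_text))
        cands = torch.tensor(encoder.encode(candidate_states))
        pred = model(s, a)
        sims = torch.cosine_similarity(pred.unsqueeze(0), cands, dim=-1)
        topk = sims.topk(min(k, len(candidate_states))).indices
    return [candidate_states[int(i)] for i in topk]

# Example: model = TransitionMLP(DIM); train only the MLP on
# (state, action, next-state) triples, then call predict_next_state.
```

Because the encoder stays frozen, only the small MLP is trained, and next-state prediction reduces to a similarity search over candidate state embeddings rather than a full autoregressive decode.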
Related benchmarks
| Task | Dataset | Protocol | Result (Hit@5) | Rank |
|---|---|---|---|---|
| Textual Planning | PDDL domains | Untrained | 3.9 | 1 |
| Textual Planning | PDDL domains | Cross-Domain | 6.6 | 1 |
| Textual Planning | PDDL domains | Leave-One-Out | 9.2 | 1 |
| Textual Planning | PDDL domains | Multi-Domain Ex. | 37.2 | 1 |
| Textual Planning | PDDL domains | Single-Domain Ex. | 0.546 | 1 |
| Textual Planning | PDDL domains | Plan-Variant | 51.2 | 1 |
| Textual Planning | PDDL domains | Single-Domain In. | 99.7 | 1 |
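Hit@5 here presumably counts a prediction as correct when the ground-truth next state appears among the five nearest candidates of the predicted embedding. A sketch of that metric under this assumption (the `hit_at_k` function and its signature are hypothetical):

```python
import numpy as np

def hit_at_k(pred_embs, gold_ids, candidate_embs, k=5):
    """Fraction of predictions whose gold next state is among the
    k nearest candidates by cosine similarity (assumed Hit@k definition)."""
    # Normalize rows so the dot product equals cosine similarity.
    pred = pred_embs / np.linalg.norm(pred_embs, axis=1, keepdims=True)
    cand = candidate_embs / np.linalg.norm(candidate_embs, axis=1, keepdims=True)
    sims = pred @ cand.T                     # (num_queries, num_candidates)
    topk = np.argsort(-sims, axis=1)[:, :k]  # indices of the k most similar
    hits = [gold in row for gold, row in zip(gold_ids, topk)]
    return float(np.mean(hits))
```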