A Goal Without a Plan Is Just a Wish: Efficient and Effective Global Planner Training for Long-Horizon Agent Tasks
About
Agents based on large language models (LLMs) tend to fall into aimless trial and error and to generate hallucinated actions in long-horizon tasks because they lack global planning. In this paper, we introduce a plan-and-execute framework and propose EAGLET, an efficient and effective planner training method that enhances the executor agent's planning abilities without human effort. Specifically, we train a plug-and-play global planner in two steps. First, we synthesize high-quality plans from an advanced LLM using our proposed homologous consensus filtering strategy and apply fine-tuning as a cold start. Second, we further improve the planner with a rule-based reinforcement learning stage that uses a novel executor capability gain reward, ensuring it can handle task instructions of varying difficulty. Experiments on three long-horizon agent tasks show that executor agents equipped with our planner outperform existing methods, achieving new state-of-the-art performance. EAGLET also reduces training costs by 8x compared to RL-based baselines and requires no manual effort or extra training data, offering an efficient and effective solution.
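The abstract does not give the formula for the executor capability gain reward. As a rough illustration only, one plausible reading is that a plan is rewarded by how much it improves the executor's task reward over running without a plan. In the sketch below, `capability_gain`, the `Executor` signature, and `toy_executor` are all hypothetical names introduced for illustration and are not taken from the paper.

```python
from typing import Callable, Optional

# Hypothetical interface (an assumption, not the paper's API): an executor
# takes a task instruction and an optional global plan, runs the task, and
# returns a scalar task reward.
Executor = Callable[[str, Optional[str]], float]


def capability_gain(executor: Executor, task: str, plan: Optional[str]) -> float:
    """Reward a plan by the change in executor reward it produces.

    This is an assumed simplification of a "capability gain" signal:
    reward with the plan minus reward without any plan.
    """
    baseline = executor(task, None)   # executor acting without a plan
    with_plan = executor(task, plan)  # same executor conditioned on the plan
    return with_plan - baseline


def toy_executor(task: str, plan: Optional[str]) -> float:
    """Stand-in executor: pretends any non-empty plan helps."""
    return 1.0 if plan else 0.5


gain = capability_gain(toy_executor, "put a clean mug on the desk", "1. find mug 2. clean it 3. place it")
print(f"capability gain: {gain}")
```

Under this reading, a plan that does not change the executor's outcome earns zero reward, and a plan that misleads the executor earns a negative one.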
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Interactive Environment Task Completion | ALFWorld Seen | Average Reward | 90.2 | 31 |
| Interactive Environment Task Completion | ALFWorld Unseen | Average Reward | 91.8 | 31 |
| Interactive Environment Task Completion | ScienceWorld Seen | Average Reward | 89.5 | 22 |
| Interactive Environment Task Completion | ScienceWorld Unseen | Average Reward | 90.1 | 22 |
| Interactive Environment Task Completion | WebShop (Seen) | Average Reward | 86.2 | 22 |
| Online Shopping | WebShop (Seen) | Success Rate | 70 | 6 |
| Task Completion | ScienceWorld Seen | Average Steps | 10.2 | 6 |
| Task Completion | ScienceWorld Unseen | Average Steps | 10.6 | 6 |
| Task Completion | ALFWorld Seen | Average Steps | 8.6 | 6 |
| Task Completion | ALFWorld Unseen | Average Steps | 8.2 | 6 |