
MPO: Boosting LLM Agents with Meta Plan Optimization

About

Recent advancements in large language models (LLMs) have enabled LLM-based agents to successfully tackle interactive planning tasks. However, existing approaches often suffer from planning hallucinations and require retraining for each new agent. To address these challenges, we propose the Meta Plan Optimization (MPO) framework, which enhances agent planning capabilities by directly incorporating explicit guidance. Unlike previous methods that rely on complex knowledge, which either requires significant human effort or lacks quality assurance, MPO leverages high-level general guidance through meta plans to assist agent planning and enables continuous optimization of the meta plans based on feedback from the agent's task execution. Our experiments on two representative tasks demonstrate that MPO significantly outperforms existing baselines. Moreover, our analysis indicates that MPO provides a plug-and-play solution that enhances both task completion efficiency and generalization capabilities in previously unseen scenarios.
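
As a rough illustration of the idea in the abstract, the sketch below prepends a meta plan to an agent's prompt without changing the agent, then uses execution feedback to choose between candidate meta plans. It is not the authors' implementation: the abstract does not specify the optimization procedure, so simple best-of-N selection stands in for it. The names `build_prompt`, `select_meta_plan`, and `toy_run_agent`, the prompt wording, and the keyword-based scoring are all hypothetical.

```python
from typing import Callable, List


def build_prompt(task: str, meta_plan: str) -> str:
    """Prepend high-level guidance to the task; the agent itself is untouched."""
    return (
        f"Task: {task}\n"
        f"Meta plan (high-level guidance):\n{meta_plan}\n"
        "Follow the guidance and act step by step."
    )


def select_meta_plan(
    task: str,
    candidates: List[str],
    run_agent: Callable[[str], float],
) -> str:
    """Return the candidate whose execution feedback (a score) is highest.

    Assumption: best-of-N selection is a stand-in for the paper's
    optimization step, which this page does not describe.
    """
    scored = [(run_agent(build_prompt(task, plan)), plan) for plan in candidates]
    _, best_plan = max(scored, key=lambda pair: pair[0])
    return best_plan


def toy_run_agent(prompt: str) -> float:
    """Hypothetical stand-in for running an agent and scoring the episode.

    It rewards prompts that mention each step this toy task needs.
    """
    required = ["find", "heat", "measure"]
    hits = sum(1 for word in required if word in prompt.lower())
    return hits / len(required)


candidates = [
    "1. Look around.",
    "1. Find a container. 2. Heat the water.",
    "1. Find a container. 2. Heat the water. 3. Measure the temperature.",
]
best = select_meta_plan("Boil water", candidates, toy_run_agent)
print(best)
```

In a real setting `run_agent` would execute the agent in the environment and return its reward or success signal; because the guidance lives only in the prompt, the same selected meta plan could in principle be reused with a different agent.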

Weimin Xiong, Yifan Song, Qingxiu Dong, Bingchan Zhao, Feifan Song, Xun Wang, Sujian Li • 2025

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Interactive Decision-making | ScienceWorld Seen | Success Rate: 80.91 | 72 |
| Interactive Decision-making | WebShop (test) | Success Rate: 87.5 | 37 |
| Interactive Decision-making | ScienceWorld Unseen | Success Rate: 77.46 | 32 |
| Interactive Decision-making | ALFWorld Seen | Success Rate: 82.9 | 32 |
| Interactive Decision-making | ALFWorld Unseen | Success Rate: 78.4 | 32 |
| Interactive Environment Task Completion | ALFWorld Seen | Average Reward: 88.2 | 31 |
| Interactive Environment Task Completion | ALFWorld Unseen | Average Reward: 88.1 | 31 |
| Interactive Environment Task Completion | ScienceWorld Seen | Average Reward: 87.8 | 22 |
| Interactive Environment Task Completion | ScienceWorld Unseen | Average Reward: 89 | 22 |
| Interactive Environment Task Completion | WebShop (Seen) | Average Reward: 83.5 | 22 |
Showing 10 of 23 rows
