MASPO: Joint Prompt Optimization for LLM-based Multi-Agent Systems

About

Large language model (LLM)-based multi-agent systems (MAS) have shown promise in tackling complex collaborative tasks, where agents are typically orchestrated via role-specific prompts. While the quality of these prompts is pivotal, jointly optimizing them across interacting agents remains a non-trivial challenge, primarily due to the misalignment between local agent objectives and holistic system goals. To address this, we introduce MASPO, a novel framework designed to automatically and iteratively refine prompts across the entire system. A core innovation of MASPO is its joint evaluation mechanism, which assesses prompts not merely by their local validity, but by their capacity to facilitate downstream success for successor agents. This effectively bridges the gap between local interactions and global outcomes without relying on ground-truth labels. Furthermore, MASPO employs a data-driven evolutionary beam search to efficiently navigate the high-dimensional prompt space. Extensive empirical evaluations across 6 diverse tasks demonstrate that MASPO consistently outperforms state-of-the-art prompt optimization methods, achieving an average accuracy improvement of 2.9. We release our code at https://github.com/wangzx1219/MASPO.
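
The two ideas named in the abstract, scoring a prompt by both its local quality and its effect on a successor agent, and searching the prompt space with a beam, can be illustrated with a toy sketch. This is not the MASPO implementation (see the released repository for that). Every name here (`beam_search`, `mutate`, `local_score`, `downstream_score`, `joint_score`, `alpha`, `HINTS`) is hypothetical, and the two scoring functions are simple deterministic stand-ins for what would, in a real system, presumably be LLM-based evaluations.

```python
from typing import Callable, List, Tuple

# Hypothetical edit operations; a real system would generate candidates with an LLM.
HINTS = [" Show each step.", " Verify the result.", " Be brief."]


def mutate(prompt: str) -> List[str]:
    """Propose variants by appending each hint not already present."""
    return [prompt + h for h in HINTS if h not in prompt]


def local_score(prompt: str) -> float:
    """Stand-in for local validity: shorter prompts score higher."""
    return 1.0 / (1 + len(prompt.split()))


def downstream_score(prompt: str) -> float:
    """Stand-in for successor-agent success: assumes the successor
    benefits from stepwise and verified upstream output."""
    return sum(k in prompt for k in ("step", "Verify")) / 2


def joint_score(prompt: str, alpha: float = 0.2) -> float:
    """Assumed weighting: mix local and downstream scores."""
    return alpha * local_score(prompt) + (1 - alpha) * downstream_score(prompt)


def beam_search(
    seeds: List[str],
    mutate: Callable[[str], List[str]],
    score: Callable[[str], float],
    beam_width: int = 2,
    rounds: int = 3,
) -> List[Tuple[float, str]]:
    """Expand every survivor each round and keep the top `beam_width`."""
    beam = sorted(((score(p), p) for p in set(seeds)), reverse=True)[:beam_width]
    for _ in range(rounds):
        pool = {p for _, p in beam}  # survivors stay eligible
        for _, prompt in beam:
            pool.update(mutate(prompt))
        beam = sorted(((score(p), p) for p in pool), reverse=True)[:beam_width]
    return beam


beam = beam_search(["Solve the problem."], mutate, joint_score)
best_score, best_prompt = beam[0]
print(best_prompt)
```

In this toy setup the " Be brief." hint never survives, because it lowers the local score without helping the stand-in downstream score; with a purely local objective (`alpha = 1.0`) the unmodified seed would win instead, which is the kind of local/global misalignment the abstract describes.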

Zhexuan Wang, Xuebo Liu, Li Wang, Zifei Shan, Yutong Wang, Zhenxi Song, Min Zhang • 2026

Related benchmarks

Task                      Dataset               Result                       Rank
Math Reasoning            AQUA                  Accuracy: 87.01              188
Code Generation           HumanEval-ET          --                           108
Reasoning                 GPQA Diamond          Accuracy: 58.08              36
Mathematical Proficiency  MATH 500              Accuracy (MATH 500): 78.4    13
Mathematical Proficiency  AGIEval MATH Level-5  Accuracy: 64.45              13