Closed-Loop Vision-Language Planning for Multi-Agent Coordination

About

Cooperative multi-agent reinforcement learning (MARL) struggles with sample efficiency, interpretability, and generalization. While Large Language Models (LLMs) offer powerful planning capabilities, their application has been hampered by a reliance on text-only inputs and a failure to handle the non-Markovian, partially observable nature of multi-agent tasks. We introduce COMPASS, a multi-agent framework that overcomes these limitations by integrating Vision-Language Models (VLMs) for decentralized, closed-loop decision-making. COMPASS dynamically generates and refines interpretable, code-based strategies stored in a skill library that is bootstrapped from expert demonstrations. To ensure robust coordination, it propagates entity information through a structured multi-hop communication protocol, allowing teams to build a coherent understanding from partial observations. Evaluated on the challenging SMACv2 benchmark, COMPASS significantly outperforms state-of-the-art MARL baselines. Notably, in the symmetric Protoss 5v5 task, COMPASS achieves a 57% win rate, a 30 percentage point advantage over QMIX (27%). The project page is available at https://stellar-entremet-1720bb.netlify.app/.

Zhiyuan Li, Wenshuai Zhao, Joni Pajarinen • 2025

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Multi-agent coordination | SMAC Terran 5v5 v2 | Median Win Rate: 39% | 10 |
| Multi-agent coordination | SMAC Zerg 5v5 v2 | Median Win Rate: 18% | 10 |
| Multi-agent coordination | SMAC Protoss 5v5 v2 | Median Win Rate: 57% | 9 |
| Multi-agent coordination | SMAC Protoss 5v6 v2 | Median Win Rate: 8% | 9 |
| Multi-agent coordination | SMAC Terran 5v6 v2 | Median Win Rate: 10% | 9 |
| Multi-agent coordination | SMAC Zerg 5v6 v2 | Median Win Rate: 4% | 9 |
