Multi-agent Coordination via Flow Matching

About

This work presents MAC-Flow, a simple yet expressive framework for multi-agent coordination. We argue that effective coordination has two requirements: (i) a rich representation of the diverse joint behaviors present in offline data and (ii) the ability to act efficiently in real time. Prior approaches often sacrifice one for the other: denoising diffusion-based solutions capture complex coordination but are computationally slow, while Gaussian policy-based solutions are fast but brittle in handling multi-agent interaction. MAC-Flow addresses this trade-off by first learning a flow-based representation of joint behaviors and then distilling it into decentralized one-step policies that preserve coordination while enabling fast execution. Across four benchmarks, covering 12 environments and 34 datasets, MAC-Flow alleviates the trade-off between performance and computational cost: it achieves about $14.5\times$ faster inference than diffusion-based multi-agent reinforcement learning (MARL) methods while maintaining good performance. At the same time, its inference speed is similar to that of prior Gaussian policy-based offline MARL methods.

Dongsu Lee, Daehee Lee, Amy Zhang • 2025
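
The two stages named in the abstract, learning a joint flow and distilling it into per-agent one-step policies, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes a straight-line (linear interpolation) flow-matching path, Euler integration for multi-step sampling, and a plain squared-error distillation term with noise shared between the flow and the one-step policies. All names (`flow_matching_loss`, `integrate_flow`, `distillation_loss`, `toy_velocity`, `toy_policies`) and the toy sizes are hypothetical, and the "networks" are fixed toy functions rather than trained models.

```python
import numpy as np

N_AGENTS, OBS_DIM, ACT_DIM = 2, 3, 2  # hypothetical toy sizes


def flow_matching_loss(velocity_fn, joint_obs, joint_actions, rng):
    """Flow-matching regression loss on a straight noise-to-data path."""
    batch = joint_actions.shape[0]
    noise = rng.standard_normal(joint_actions.shape)  # x_0 ~ N(0, I)
    t = rng.uniform(size=(batch, 1))                   # t ~ U[0, 1]
    x_t = (1.0 - t) * noise + t * joint_actions        # point on the path
    target = joint_actions - noise                     # constant path velocity
    pred = velocity_fn(joint_obs, x_t, t)
    return float(np.mean(np.sum((pred - target) ** 2, axis=-1)))


def integrate_flow(velocity_fn, joint_obs, noise, n_steps=10):
    """Multi-step Euler sampling from the joint flow (the slow path)."""
    x, dt = noise.copy(), 1.0 / n_steps
    for k in range(n_steps):
        t = np.full((joint_obs.shape[0], 1), k * dt)
        x = x + dt * velocity_fn(joint_obs, x, t)
    return x


def distillation_loss(policies, velocity_fn, obs, rng):
    """Match each agent's one-step output to its slice of the flow sample."""
    batch = obs.shape[0]  # obs has shape (batch, N_AGENTS, OBS_DIM)
    noise = rng.standard_normal((batch, N_AGENTS, ACT_DIM))
    teacher = integrate_flow(
        velocity_fn, obs.reshape(batch, -1), noise.reshape(batch, -1)
    ).reshape(batch, N_AGENTS, ACT_DIM)
    per_agent = [
        # One forward pass per agent, using only that agent's observation.
        np.mean(np.sum((pi(obs[:, i], noise[:, i]) - teacher[:, i]) ** 2, axis=-1))
        for i, pi in enumerate(policies)
    ]
    return float(np.mean(per_agent))


# Toy stand-ins for trained networks (hypothetical, for illustration only).
rng = np.random.default_rng(0)
W = [rng.standard_normal((OBS_DIM, ACT_DIM)) for _ in range(N_AGENTS)]


def toy_velocity(joint_obs, x, t):
    # Ignores x and t, so the flow is a per-agent translation of the noise.
    obs = joint_obs.reshape(-1, N_AGENTS, OBS_DIM)
    return np.concatenate(
        [np.tanh(obs[:, i] @ W[i]) for i in range(N_AGENTS)], axis=-1
    )


# Each one-step policy maps (local observation, noise) to an action directly.
toy_policies = [lambda o, z, Wi=Wi: z + np.tanh(o @ Wi) for Wi in W]

obs = rng.standard_normal((4, N_AGENTS, OBS_DIM))
actions = rng.standard_normal((4, N_AGENTS * ACT_DIM))
fm_loss = flow_matching_loss(toy_velocity, obs.reshape(4, -1), actions, rng)
distill = distillation_loss(toy_policies, toy_velocity, obs, rng)
```

In this toy setup the joint flow needs `n_steps` velocity evaluations per action, while each distilled policy needs a single evaluation on local observations only, which is the kind of inference-cost gap the abstract describes. Because the toy velocity field happens to factorize across agents, the one-step policies reproduce the flow samples almost exactly; with a learned, genuinely joint flow the distillation error would generally be nonzero.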

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Multi-agent continuous control | MA-MuJoCo 6Halfcheetah-Medium | Average Performance: 5.14e+3 | 16 |
| Multi-agent continuous control | MA-MuJoCo 3Hopper-Expert | Average Performance: 3.59e+3 | 8 |
| Multi-agent continuous control | MA-MuJoCo 2Ant-Medium | Average Performance: 1.43e+3 | 8 |
| Multi-agent continuous control | MA-MuJoCo 2Ant-MR | Average Performance: 1.50e+3 | 8 |
| Multi-agent continuous control | MA-MuJoCo 2Ant-ME | Average Performance: 2.05e+3 | 8 |
| Multi-agent continuous control | MA-MuJoCo 3Hopper-MR | Average Performance: 1.17e+3 | 8 |
| Multi-agent continuous control | MA-MuJoCo 3Hopper-ME | Average Performance: 2.99e+3 | 8 |
| Multi-agent continuous control | MA-MuJoCo 6Halfcheetah-Expert | Average Performance: 4.65e+3 | 8 |
| Multi-agent continuous control | MA-MuJoCo 2Ant-Expert | Average Performance: 2.06e+3 | 8 |
| Multi-agent continuous control | MA-MuJoCo 6Halfcheetah-MR | Average Performance: 3.03e+3 | 8 |
Showing 10 of 11 rows
