Multi-agent Coordination via Flow Matching

About

This work presents MAC-Flow, a simple yet expressive framework for multi-agent coordination. We argue that effective coordination has two requirements: (i) a rich representation of the diverse joint behaviors present in offline data and (ii) the ability to act efficiently in real time. Prior approaches often sacrifice one for the other: denoising diffusion-based solutions capture complex coordination but are computationally slow, while Gaussian policy-based solutions are fast but brittle in handling multi-agent interaction. MAC-Flow addresses this trade-off by first learning a flow-based representation of joint behaviors, and then distilling it into decentralized one-step policies that preserve coordination while enabling fast execution. Across four benchmarks, covering 12 environments and 34 datasets, MAC-Flow alleviates the trade-off between performance and computational cost, achieving about 14.5× faster inference than diffusion-based offline multi-agent reinforcement learning (MARL) methods while maintaining good performance. At the same time, its inference speed is similar to that of prior Gaussian policy-based offline MARL methods.
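The two-stage idea in the abstract (fit a flow over joint actions, then distill it into per-agent one-step policies) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a straight-line noise-to-action path and plain Euler integration, it omits any value-based or other terms the actual method may use, and every name here (`flow_matching_loss`, `sample_multi_step`, `distillation_loss`, `velocity_fn`, `agent_policies`) is hypothetical and chosen for illustration.

```python
import random


def interpolate(noise, action, t):
    """Point at time t on the straight path from noise (t=0) to action (t=1)."""
    return [(1.0 - t) * z + t * a for z, a in zip(noise, action)]


def flow_matching_loss(velocity_fn, obs, action, rng):
    """Squared error between predicted and target velocity for one sample.

    Assumption: a linear path, whose velocity is the constant (action - noise).
    """
    noise = [rng.gauss(0.0, 1.0) for _ in action]
    t = rng.random()  # uniform in [0, 1)
    x_t = interpolate(noise, action, t)
    target = [a - z for a, z in zip(action, noise)]
    pred = velocity_fn(obs, x_t, t)
    return sum((p - g) ** 2 for p, g in zip(pred, target)) / len(action)


def sample_multi_step(velocity_fn, obs, noise, num_steps=10):
    """Euler-integrate the velocity field from t=0 to t=1.

    This needs num_steps evaluations of velocity_fn, which is the slow path.
    """
    x = list(noise)
    dt = 1.0 / num_steps
    for k in range(num_steps):
        v = velocity_fn(obs, x, k * dt)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


def distillation_loss(agent_policies, velocity_fn, joint_obs, joint_noise,
                      num_steps=10):
    """Regress per-agent one-step policies onto the multi-step joint output."""
    # The joint flow sees every agent's observation and noise, concatenated.
    flat_obs = [o for agent_obs in joint_obs for o in agent_obs]
    flat_noise = [z for agent_noise in joint_noise for z in agent_noise]
    target = sample_multi_step(velocity_fn, flat_obs, flat_noise, num_steps)
    # Each agent sees only its own observation and noise: one call, no ODE solve.
    pred = []
    for policy, o, z in zip(agent_policies, joint_obs, joint_noise):
        pred.extend(policy(o, z))
    return sum((p - g) ** 2 for p, g in zip(pred, target)) / len(target)
```

In this sketch, executing an action costs one policy call per agent, whereas sampling from the joint flow costs `num_steps` evaluations of the velocity function. That difference is the kind of inference saving the abstract refers to, though the reported 14.5× figure comes from the paper's own measurements, not from this toy code.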

Dongsu Lee, Daehee Lee, Amy Zhang • 2025

Related benchmarks

Task	Dataset	Result
Multi-agent continuous control	MA-MuJoCo 6Halfcheetah-Medium	Average Performance: 5.14e+3	16
Multi-agent continuous control	MA-MuJoCo 3Hopper-Expert	Average Performance: 3.59e+3	8
Multi-agent continuous control	MA-MuJoCo 2Ant-Medium	Average Performance: 1.43e+3	8
Multi-agent continuous control	MA-MuJoCo 2Ant-MR	Average Performance: 1.50e+3	8
Multi-agent continuous control	MA-MuJoCo 2Ant-ME	Average Performance: 2.05e+3	8
Multi-agent continuous control	MA-MuJoCo 3Hopper-MR	Average Performance: 1.17e+3	8
Multi-agent continuous control	MA-MuJoCo 3Hopper-ME	Average Performance: 2.99e+3	8
Multi-agent continuous control	MA-MuJoCo 6Halfcheetah-Expert	Average Performance: 4.65e+3	8
Multi-agent continuous control	MA-MuJoCo 2Ant-Expert	Average Performance: 2.06e+3	8
Multi-agent continuous control	MA-MuJoCo 6Halfcheetah-MR	Average Performance: 3.03e+3	8

