MIMIC-D: Multi-modal Imitation for MultI-agent Coordination with Decentralized Diffusion Policies
About
As robots become more integrated into society, their ability to coordinate with other robots and humans on multi-modal tasks (those with multiple valid solutions) is crucial. Such behaviors can be learned from expert demonstrations via imitation learning (IL), but when the demonstrations are multi-modal, standard IL approaches usually average across modes or collapse to a single mode, preventing effective coordination. Inspired by diffusion models' ability to capture complex multi-modal trajectory distributions in single-agent settings, we develop a diffusion-based framework for coordinated multi-modal behavior in multi-agent systems. However, existing multi-agent diffusion approaches typically require a centralized planner or explicit communication among agents. This assumption can fail in real-world scenarios where robots must operate independently or alongside agents, such as humans, with whom they cannot directly communicate. Therefore, we propose MIMIC-D, a joint-training, decentralized-execution paradigm for multi-modal multi-agent IL via diffusion. We jointly train all agents' policies with only local information to achieve implicit coordination. In simulation and hardware experiments, our method exhibits robust multi-modal coordination behavior across a variety of tasks and environments, improving upon state-of-the-art baselines.
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Road Crossing | Three-Agent Road Crossing | Collision Rate (per 100 Steps) | 0.00 | 16 |
| Multi-robot coordination | Room Map | Success Rate | 40 | 12 |
| Multi-robot coordination | Shelf Map | Success Rate | 52 | 12 |
| Multi-robot coordination | Basic Map | Success Rate | 8 | 12 |
| Multi-robot coordination | Dense Map | Success Rate | 12 | 12 |
| Multi-agent Navigation | Basic Map | Success Rate | 0.00 | 12 |
| Multi-agent Navigation | Dense Map | Success Rate | 0.00 | 12 |
| Multi-robot path planning | Original Benchmark Room map | Running Time | 19.99 | 11 |
| Multi-robot path planning | Original Benchmark Shelf map | Running Time | 19.37 | 11 |
| Multi-robot path planning | Original Benchmark Basic map | Running Time | 7.52 | 9 |