Multi-Agent Generative Adversarial Imitation Learning
About
Imitation learning algorithms can be used to learn a policy from expert demonstrations without access to a reward signal. However, most existing approaches are not applicable in multi-agent settings due to the existence of multiple (Nash) equilibria and non-stationary environments. We propose a new framework for multi-agent imitation learning for general Markov games, where we build upon a generalized notion of inverse reinforcement learning. We further introduce a practical multi-agent actor-critic algorithm with good empirical performance. Our method can be used to imitate complex behaviors in high-dimensional environments with multiple cooperative or competing agents.
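The adversarial imitation idea behind the method can be sketched in miniature: each agent gets its own discriminator that learns to tell expert (state, action) pairs from policy rollouts, and the negative discriminator loss serves as a surrogate reward for that agent's policy update. The sketch below uses toy logistic discriminators on synthetic data; all names, dimensions, and hyperparameters are illustrative assumptions, not the paper's actual neural architecture or training setup.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class AgentDiscriminator:
    """Toy per-agent logistic discriminator D_i over (state, action) features.

    Stand-in for the neural discriminators in multi-agent GAIL; purely
    illustrative, not the paper's implementation.
    """
    def __init__(self, dim, lr=0.5):
        self.w = np.zeros(dim)
        self.b = 0.0
        self.lr = lr

    def logits(self, x):
        return x @ self.w + self.b

    def update(self, expert_x, policy_x):
        # Gradient ascent on E_expert[log D] + E_policy[log(1 - D)].
        for x, label in ((expert_x, 1.0), (policy_x, 0.0)):
            p = sigmoid(self.logits(x))
            grad = label - p                      # per-sample logit gradient
            self.w += self.lr * (x.T @ grad) / len(x)
            self.b += self.lr * grad.mean()

    def reward(self, x):
        # GAIL-style surrogate reward for the policy: -log(1 - D(s, a)).
        return -np.log(1.0 - sigmoid(self.logits(x)) + 1e-8)

# Two agents; each (state, action) pair is a 4-dim feature vector (toy data).
discs = [AgentDiscriminator(dim=4) for _ in range(2)]
expert = rng.normal(1.0, 1.0, size=(2, 256, 4))    # synthetic expert pairs
policy = rng.normal(-1.0, 1.0, size=(2, 256, 4))   # synthetic policy rollouts

for step in range(200):
    for i, d in enumerate(discs):
        d.update(expert[i], policy[i])

# Once the discriminators separate the two distributions, expert-like pairs
# earn higher surrogate reward, which is the signal the policy update follows.
margins = [d.reward(expert[i]).mean() - d.reward(policy[i]).mean()
           for i, d in enumerate(discs)]
```

In the full algorithm the policy side of this loop is an actor-critic update driven by these surrogate rewards, and the discriminators are neural networks trained jointly with the policies rather than the fixed-data logistic fit shown here.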
Jiaming Song, Hongyu Ren, Dorsa Sadigh, Stefano Ermon • 2018
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Road Crossing | Three-Agent Road Crossing | Collision Rate (per 100 Steps) | 32 | 16 |
| Multi-agent Navigation | Two-Agent Swap (100 sampled trajectories) | Agent-Agent Collisions | 97 | 4 |
| Trajectory Distribution Matching | Two-Agent Swap, Robosuite simulation (100 sampled trajectories) | EMD (Agent 1) | 4.67 | 4 |
| Trajectory Distribution Matching | Three-Agent Road Crossing | EMD (Agent 1) | 1.2749 | 4 |
| Two-Arm Lift | Two-Arm Lift Simulation | Successful Lifts | 0.00e+0 | 4 |