Multiagent Cooperation and Competition with Deep Reinforcement Learning

About

Multiagent systems appear in most social, economic, and political situations. In the present work we extend the Deep Q-Learning Network architecture proposed by Google DeepMind to multiagent environments and investigate how two agents controlled by independent Deep Q-Networks interact in the classic videogame Pong. By manipulating the classical rewarding scheme of Pong we demonstrate how competitive and collaborative behaviors emerge. Competitive agents learn to play and score efficiently. Agents trained under collaborative rewarding schemes find an optimal strategy to keep the ball in the game as long as possible. We also describe the progression from competitive to collaborative behavior. The present work demonstrates that Deep Q-Networks can become a practical tool for studying the decentralized learning of multiagent systems living in highly complex environments.
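
The idea of manipulating Pong's rewarding scheme can be illustrated with a minimal sketch. It assumes a single hypothetical parameter `rho` that sets the reward of the player who scores, while the player who concedes always receives -1; the function name, the parameter, and the exact values are assumptions made for illustration and are not stated in the abstract. With `rho = 1` scoring is rewarded (competitive), and with `rho = -1` both players are penalized whenever the ball leaves the game (collaborative).

```python
def pong_rewards(scorer, rho):
    """Return (left_reward, right_reward) for one point.

    scorer: "left" or "right", the player who won the point.
    rho:    hypothetical knob in [-1, 1]; the reward given to the scorer.
            The conceding player is assumed to always receive -1.
    """
    if scorer not in ("left", "right"):
        raise ValueError("scorer must be 'left' or 'right'")
    if not -1.0 <= rho <= 1.0:
        raise ValueError("rho must lie in [-1, 1]")

    conceding_reward = -1.0
    if scorer == "left":
        return (rho, conceding_reward)
    return (conceding_reward, rho)


# Competitive setting: the scorer is rewarded, the conceder is penalized.
competitive = pong_rewards("left", 1.0)

# Collaborative setting: losing the ball is penalized for both players,
# so the only way to avoid penalties is to keep the ball in play.
collaborative = pong_rewards("left", -1.0)
```

Intermediate values of `rho` would correspond to schemes between the two extremes, and each agent would feed only its own reward into its own independent Q-network.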

Ardi Tampuu, Tambet Matiisen, Dorian Kodelja, Ilya Kuzovkin, Kristjan Korjus, Juhan Aru, Jaan Aru, Raul Vicente • 2015

Related benchmarks

Task	Dataset	Result
Multi-Agent Reinforcement Learning	Level-Based Foraging 10x10-4p-3f v2 (test)	Final Episode Return: 37	10
Multi-Agent Reinforcement Learning	Level-Based Foraging 2s-10x10-3p-3f v2 (test)	Final Episode Return: 56	10
Multi-Agent Reinforcement Learning	Level-Based Foraging 10x10-3p-5f v2 (test)	Final Episode Return: 11	10
Multi-Agent Reinforcement Learning	Level-Based Foraging 2s-8x8-2p-2f-coop v2 (test)	Final Episode Return: 65	10
Multi-Agent Reinforcement Learning	MAMuJoCo HalfCheetah 6x1 (test)	Average Episodic Return: 16.03	8
Multi-Agent Reinforcement Learning	MAMuJoCo Hopper 3x1 (test)	Average Episodic Return: 17.09	8
Multi-Agent Reinforcement Learning	MAMuJoCo Walker2d 6x1 (test)	Average Episodic Return: 18.61	8
Multi-Agent Reinforcement Learning	MAMuJoCo Ant 8x1 (test)	Average Episodic Return: 22.5	8
AUV target-tracking	AUV target-tracking simulator medium scenario, combined-stress setting	Tracking Distance Error (km): 0.749	7
AUV target-tracking	AUV target-tracking simulator medium scenario, nominal setting	Tracking Error (km): 0.68	7

Showing 10 of 12 rows
