A Unified Game-Theoretic Approach to Multiagent Reinforcement Learning

About

To achieve general intelligence, agents must learn how to interact with others in a shared environment: this is the challenge of multiagent reinforcement learning (MARL). The simplest form is independent reinforcement learning (InRL), where each agent treats its experience as part of its (non-stationary) environment. In this paper, we first observe that policies learned using InRL can overfit to the other agents' policies during training, failing to sufficiently generalize during execution. We introduce a new metric, joint-policy correlation, to quantify this effect. We describe an algorithm for general MARL, based on approximate best responses to mixtures of policies generated using deep reinforcement learning, and empirical game-theoretic analysis to compute meta-strategies for policy selection. The algorithm generalizes previous ones such as InRL, iterated best response, double oracle, and fictitious play. Then, we present a scalable implementation which reduces the memory requirement using decoupled meta-solvers. Finally, we demonstrate the generality of the resulting policies in two partially observable settings: gridworld coordination games and poker.
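The loop described in the abstract (compute a meta-strategy over the current policy sets, then add a best response to it) can be illustrated on a toy problem. The following is a minimal sketch under loud assumptions, not the authors' implementation: exact best responses on a rock-paper-scissors payoff matrix stand in for the deep reinforcement learning oracle, plain fictitious play stands in for the meta-solver, and all function names (`solve_meta_game`, `best_response_row`, `best_response_col`, `policy_space_loop`) are hypothetical.

```python
# Sketch only: exact best responses replace the deep-RL oracle, and
# fictitious play replaces the meta-solver. All names are hypothetical.

# Row player's payoffs for rock-paper-scissors (zero-sum: the column
# player receives the negative). Indices: 0 = rock, 1 = paper, 2 = scissors.
PAYOFF = [
    [0, -1, 1],
    [1, 0, -1],
    [-1, 1, 0],
]


def solve_meta_game(rows, cols, iterations=2000):
    """Approximate an equilibrium of the restricted game by fictitious play."""
    row_counts = [1.0] * len(rows)
    col_counts = [1.0] * len(cols)
    for _ in range(iterations):
        # Each side best-responds to the opponent's empirical play so far.
        row_values = [sum(PAYOFF[r][c] * col_counts[j] for j, c in enumerate(cols))
                      for r in rows]
        col_values = [sum(-PAYOFF[r][c] * row_counts[i] for i, r in enumerate(rows))
                      for c in cols]
        row_counts[row_values.index(max(row_values))] += 1
        col_counts[col_values.index(max(col_values))] += 1
    row_total, col_total = sum(row_counts), sum(col_counts)
    return ([x / row_total for x in row_counts],
            [x / col_total for x in col_counts])


def best_response_row(cols, col_mix):
    """Best row action in the full game against a mixture over `cols`."""
    values = [sum(PAYOFF[r][c] * p for c, p in zip(cols, col_mix))
              for r in range(len(PAYOFF))]
    return values.index(max(values))


def best_response_col(rows, row_mix):
    """Best column action in the full game against a mixture over `rows`."""
    values = [sum(-PAYOFF[r][c] * p for r, p in zip(rows, row_mix))
              for c in range(len(PAYOFF[0]))]
    return values.index(max(values))


def policy_space_loop(max_epochs=10):
    rows, cols = [0], [0]  # both players start with "rock" only
    for _ in range(max_epochs):
        row_mix, col_mix = solve_meta_game(rows, cols)
        new_row = best_response_row(cols, col_mix)
        new_col = best_response_col(rows, row_mix)
        if new_row in rows and new_col in cols:
            break  # no new policy to add: the restricted game is closed
        if new_row not in rows:
            rows.append(new_row)
        if new_col not in cols:
            cols.append(new_col)
    row_mix, col_mix = solve_meta_game(rows, cols)
    return rows, row_mix, cols, col_mix


if __name__ == "__main__":
    rows, row_mix, cols, col_mix = policy_space_loop()
    print("row policies:", rows, "meta-strategy:", [round(p, 3) for p in row_mix])
```

In this toy setting the policy sets grow from rock alone to all three actions, and the final meta-strategy is close to uniform. Swapping the meta-solver changes which earlier algorithm the loop resembles, which is the sense in which the abstract says the approach generalizes iterated best response, double oracle, and fictitious play.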

Marc Lanctot, Vinicius Zambaldi, Audrunas Gruslys, Angeliki Lazaridou, Karl Tuyls, Julien Perolat, David Silver, Thore Graepel • 2017

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Two-Player Zero-Sum Game Solving | Goofspiel 13 cards | Estimated PE: 66.7 | 112 |
| Competitive Game Strategy Optimization | RPS 1000D | Final KL Divergence: 0.0011 | 7 |
| Pursuit-Evasion | Grid Map | Success Rate: 100 | 7 |
| Pursuit-Evasion | Scotland-Yard Map | Success Rate: 100 | 7 |
| Pursuit-Evasion | Hollywood Walk of Fame | Success Rate: 95 | 7 |
| Pursuit-Evasion | The Bund | Success Rate: 95 | 7 |
| Coordination Game Strategy Optimization | Battle of Sexes | P1 Strategy Profile: 0.67 | 7 |
| Pursuit-Evasion | Downtown Map | Success Rate: 99 | 7 |
| Pursuit-Evasion | Sagrada Familia | Success Rate: 93 | 7 |
| Pursuit-Evasion | Big Ben | Success Rate: 99 | 7 |
Showing 10 of 16 rows
