
Message-Dropout: An Efficient Training Method for Multi-Agent Deep Reinforcement Learning

About

In this paper, we propose a new learning technique named message-dropout to improve the performance of multi-agent deep reinforcement learning under two application scenarios: 1) classical multi-agent reinforcement learning with direct message communication among agents and 2) centralized training with decentralized execution. In the first scenario, in which direct message communication among agents is allowed, message-dropout drops the messages received from other agents in a block-wise manner with a certain probability during training, and compensates for this effect by multiplying the weights of the dropped-out block units by a correction probability. Applied this way, message-dropout effectively handles the increased input dimension in multi-agent reinforcement learning with communication and makes learning robust against communication errors in the execution phase. In the second scenario, centralized training with decentralized execution, we specifically consider applying message-dropout to Multi-Agent Deep Deterministic Policy Gradient (MADDPG), which uses a centralized critic to train a decentralized actor for each agent. We evaluate message-dropout on several games, and numerical results show that, with a proper dropout rate, it significantly improves reinforcement learning performance in terms of both training speed and steady-state performance in the execution phase.

Woojun Kim, Myungsik Cho, Youngchul Sung • 2019
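
The block-wise dropout described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The function name `message_dropout`, its parameters, and the array shapes are hypothetical, and the abstract does not state the value of the correction probability, so the sketch assumes it equals the keep probability `1 - drop_prob`, as in standard dropout. Scaling the message inputs at execution time is used here as an equivalent stand-in for scaling the first-layer weights attached to the message blocks.

```python
import numpy as np


def message_dropout(own_obs, messages, drop_prob, rng, training=True):
    """Build one agent's network input from its observation and received messages.

    own_obs:  1-D array, the agent's own observation (never dropped).
    messages: 2-D array of shape (n_other_agents, msg_dim), one row per sender.
    training=True:  each message row is zeroed as a whole block with
                    probability drop_prob.
    training=False: all messages are kept and scaled by (1 - drop_prob)
                    (assumed correction factor, see the note above).
    """
    own_obs = np.asarray(own_obs, dtype=float)
    messages = np.asarray(messages, dtype=float)
    if training:
        # One keep/drop decision per sender, broadcast over that sender's block.
        keep = rng.random(messages.shape[0]) >= drop_prob
        messages = messages * keep[:, None]
    else:
        messages = messages * (1.0 - drop_prob)
    return np.concatenate([own_obs, messages.ravel()])


# Usage: 3 other agents, each sending a 4-dimensional message.
rng = np.random.default_rng(0)
own = np.ones(2)
msgs = np.ones((3, 4))
x_train = message_dropout(own, msgs, 0.5, rng, training=True)
x_exec = message_dropout(own, msgs, 0.5, rng, training=False)
```

Dropping whole blocks rather than individual units means the network sees, during training, inputs in which a sender's entire message is absent, which is the situation a lost message produces at execution time.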

Related benchmarks

Task | Dataset | Metric | Result | Rank
Multi-agent cooperation | Simple_Tag 9 agents | Avg Reward | 77.8 | 42
Multi-agent cooperation | Simple_Tag 3 agents | Average Reward | 72.7 | 42
Multi-agent cooperation | Simple_Tag 6 agents | Average Reward | 84.2 | 42
Multi-Agent Reinforcement Learning | Simple Adversary Heavy DBC | Avg Episode Reward | -6.8 | 7
Multi-Agent Reinforcement Learning | Simple_Tag Medium MBC (6) | Avg Episode Reward | 72.1 | 7
Multi-Agent Reinforcement Learning | Simple_Tag Heavy MBC (8) | Avg Episode Reward | 72.7 | 7
Multi-Agent Reinforcement Learning | Simple_Tag Light DBC (5) | Cumulative Reward | 70.8 | 7
Multi-Agent Reinforcement Learning | Simple_Tag Medium DBC (3) | Cumulative Reward | 70.7 | 7
Multi-Agent Reinforcement Learning | Simple Tag Heavy DBC | Avg Episode Reward | 71.4 | 7
Multi-Agent Reinforcement Learning | Simple_Spread Unrestricted | Cumulative Reward | -137.5 | 7
Showing 10 of 16 rows
