
On the Robustness of Cooperative Multi-Agent Reinforcement Learning

About

In cooperative multi-agent reinforcement learning (c-MARL), agents learn to cooperatively take actions as a team to maximize a total team reward. We analyze the robustness of c-MARL to adversaries capable of attacking one of the agents on a team. Through the ability to manipulate this agent's observations, the adversary seeks to decrease the total team reward. Attacking c-MARL is challenging for three reasons: first, it is difficult to estimate team rewards or how they are impacted by an agent mispredicting; second, models are non-differentiable; and third, the feature space is low-dimensional. Thus, we introduce a novel attack. The attacker first trains a policy network with reinforcement learning to find a wrong action it should encourage the victim agent to take. Then, the adversary uses targeted adversarial examples to force the victim to take this action. Our results on the StarCraft II multi-agent benchmark demonstrate that c-MARL teams are highly vulnerable to perturbations applied to one of their agents' observations. By attacking a single agent, our attack method has a highly negative impact on the overall team reward, reducing it from 20 to 9.4. As a result, the team's winning rate drops from 98.9% to 0%.
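The second step of the attack, pushing the victim toward a chosen wrong action with a bounded change to its observation, can be illustrated with a toy sketch. This is not the paper's method. It assumes a hypothetical linear scoring function `toy_q_values` as the victim, a fixed `target_action` in place of the attacker's learned policy, and a plain random search (chosen here only because it needs no gradients) within an L-infinity budget `epsilon`. All names and numbers below are illustrative assumptions.

```python
import numpy as np

def targeted_perturbation(q_values, obs, target_action, epsilon, n_trials=2000, seed=0):
    """Gradient-free random search for a targeted, bounded perturbation.

    q_values: hypothetical callable mapping an observation to per-action scores.
    Returns an observation within an L-infinity ball of radius epsilon around
    obs for which target_action has the highest score, or None if the search
    finds no such observation.
    """
    rng = np.random.default_rng(seed)
    best_obs, best_margin = None, -np.inf
    for _ in range(n_trials):
        # Sample a candidate perturbation inside the allowed budget.
        delta = rng.uniform(-epsilon, epsilon, size=obs.shape)
        candidate = obs + delta
        q = q_values(candidate)
        # A positive margin means the target action beats every other action.
        margin = q[target_action] - np.delete(q, target_action).max()
        if margin > best_margin:
            best_obs, best_margin = candidate, margin
    return best_obs if best_margin > 0 else None

# Toy victim (assumption): two actions scored by an identity weight matrix.
W = np.eye(2)

def toy_q_values(observation):
    return W @ observation

obs = np.array([1.0, 0.9])                # victim prefers action 0 on the clean input
clean_action = int(np.argmax(toy_q_values(obs)))
adv_obs = targeted_perturbation(toy_q_values, obs, target_action=1, epsilon=0.2)
too_small = targeted_perturbation(toy_q_values, obs, target_action=1, epsilon=0.01)
```

With the larger budget the search finds an observation that flips the toy victim to action 1, while the smaller budget cannot close the score gap and returns `None`. In the attack described above, the target action would come from the attacker's trained policy network instead of being fixed by hand.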

Jieyu Lin, Kristina Dzeparoska, Sai Qian Zhang, Alberto Leon-Garcia, Nicolas Papernot • 2020

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Adversarial Attack | SMAC 1c3s5z | Reward: 14.8 | 12 |
| Adversarial Attack | MPE spread | Reward Score: -953.3 | 12 |
| Adversarial Attack | SMAC 8m | Reward: 15.08 | 12 |
| Adversarial Attack | SMAC bane_vs_bane | Reward: 16.02 | 12 |
| Adversarial Attack | SMAC 27m_vs_30m | Reward: 16.78 | 12 |
| Adversarial Attack | Google Research Football counterattack | Reward: 1.27 | 12 |
| Adversarial Attack | Google Research Football 3 vs 1 | Reward: 1.75 | 12 |
| Adversarial Attack | Multi-Agent Particle Environment reference | Reward: -33.74 | 12 |
| Attack Detection | SMAC 1c3s5z | F1 Score: 66 | 5 |
| Attack Detection | SMAC 8m | F1 Score: 69 | 5 |
Showing 10 of 22 rows
