Tactics of Adversarial Attack on Deep Reinforcement Learning Agents
About
We introduce two tactics for attacking agents trained by deep reinforcement learning algorithms with adversarial examples: the strategically-timed attack and the enchanting attack. In the strategically-timed attack, the adversary aims to minimize the agent's reward by attacking the agent at only a small subset of time steps in an episode. Limiting the attack activity to this subset helps prevent detection of the attack by the agent. We propose a novel method to determine when an adversarial example should be crafted and applied. In the enchanting attack, the adversary aims to lure the agent to a designated target state. This is achieved by combining a generative model with a planning algorithm: the generative model predicts future states, while the planning algorithm generates a preferred sequence of actions for luring the agent. A sequence of adversarial examples is then crafted to lure the agent into taking the preferred sequence of actions. We apply the two tactics to agents trained by state-of-the-art deep reinforcement learning algorithms, including DQN and A3C. In 5 Atari games, our strategically-timed attack reduces the agent's reward as much as the uniform attack (i.e., attacking at every time step) while attacking the agent four times less often. Our enchanting attack lures the agent toward designated target states with a success rate of more than 70%. Videos are available at http://yenchenlin.me/adversarial_attack_RL/
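To make the when-to-attack idea of the strategically-timed attack concrete, here is a minimal Python sketch. The helpers `policy_probs` and `craft_perturbation` are hypothetical placeholders (not the paper's implementation), and the threshold `beta` and step size `epsilon` are illustrative values: the idea is to strike only when the policy strongly prefers one action, i.e., when forcing a different action is most damaging.

```python
import numpy as np

def attack_preference(state, policy_probs):
    """Relative action preference: how strongly the policy favors its best
    action over its worst. The strategically-timed attack only strikes when
    this gap is large, i.e., when taking a wrong action is costly."""
    probs = policy_probs(state)  # hypothetical: returns pi(.|s) as an array
    return probs.max() - probs.min()

def strategically_timed_step(state, policy_probs, craft_perturbation,
                             beta=0.5, epsilon=0.01):
    """Attack only when the preference gap exceeds the threshold beta;
    otherwise pass the clean observation through. `craft_perturbation` is a
    stand-in for an FGSM-style adversarial-example generator."""
    if attack_preference(state, policy_probs) >= beta:
        delta = craft_perturbation(state)  # gradient-based adversarial noise
        return np.clip(state + epsilon * np.sign(delta), 0.0, 1.0)
    return state
```

The enchanting attack can be sketched similarly. Assuming a learned transition model `generative_model(state, action)` that predicts the next state (a stand-in for the paper's generative model), a simple planner scores candidate action sequences by how close their predicted rollout ends to the designated target state; per-step adversarial examples would then be crafted to lure the agent into taking the chosen sequence.

```python
import numpy as np

def enchanting_plan(state, generative_model, candidate_sequences, target_state):
    """Return the candidate action sequence whose predicted rollout lands
    closest (in L2 distance) to the designated target state."""
    best_seq, best_dist = None, float("inf")
    for actions in candidate_sequences:
        s = state
        for a in actions:
            s = generative_model(s, a)  # model-predicted next state
        dist = np.linalg.norm(s - target_state)
        if dist < best_dist:
            best_seq, best_dist = actions, dist
    return best_seq
```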
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Adversarial Attack | Seaquest | Cumulative Reward | 328.2 | 80 |
| Adversarial Attack | Space Invaders | Cumulative Reward | 219.3 | 80 |
| Adversarial Attack | Pong | Cumulative Reward | 20.01 | 80 |
| Adversarial Attack | Qbert | Cumulative Reward | 569.4 | 80 |
| Adversarial Attack | Breakout White-box discrete (test) | Cumulative Reward | 58.47 | 36 |
| Adversarial Attack | Breakout Black-box discrete (test) | Cumulative Reward | 70.24 | 36 |