When Actions Disappear: Adversarial Action Removal in Self-Play Reinforcement Learning
About
We study adversarial action masking in self-play reinforcement learning: an attacker selectively removes legal actions from a victim's action set. Unlike observation or action perturbations, removal eliminates decision options before the agent acts. Across poker games scaling from 6 to 5,531 information states and two non-poker domains, learned masking causes substantially more damage than random masking and learned perturbation baselines. The attack persists across Q-learning, PPO, NFSP, neural NFSP, and DQN victims; transfers across agents; is amplified by self-play; and shows no recovery under extended masked training. Mechanistically, the adversary targets high-value decision points, captured by reach-weighted contingent action capacity (CAC$_w$) and a value-weighted refinement CAC$_v$. These results identify action availability as a distinct robustness surface in self-play RL.
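As a rough illustration of the threat model described above, the sketch below shows an attacker removing legal actions before a greedy victim chooses. Everything in it is an assumption made for illustration: the function names, the toy Q-values, and in particular `targeted_mask`, which is a simple value-greedy heuristic standing in for the learned adversary and is not the paper's method.

```python
import random

def apply_mask(legal_actions, removed):
    """Return the victim's action set after the attacker removes `removed`.

    Assumption: the attacker may never leave the victim with no action,
    so a mask that would empty the set is ignored.
    """
    remaining = [a for a in legal_actions if a not in removed]
    return remaining if remaining else list(legal_actions)

def greedy_action(q_values, actions):
    """Victim policy: pick the highest-valued action still available."""
    return max(actions, key=lambda a: q_values[a])

def targeted_mask(q_values, legal_actions, budget=1):
    """Hypothetical stand-in for a learned adversary:
    remove the `budget` actions the victim values most."""
    ranked = sorted(legal_actions, key=lambda a: q_values[a], reverse=True)
    return set(ranked[:budget])

def random_mask(legal_actions, budget=1, rng=random):
    """Random-masking baseline: remove `budget` actions uniformly at random."""
    return set(rng.sample(list(legal_actions), budget))

# Toy decision point with made-up values (not taken from the paper).
q = {"fold": -1.0, "call": 0.2, "raise": 0.9}
legal = ["fold", "call", "raise"]

masked = apply_mask(legal, targeted_mask(q, legal))
choice = greedy_action(q, masked)
loss = q[greedy_action(q, legal)] - q[choice]  # value lost to the mask
```

In this toy case the targeted mask removes `raise`, so the victim falls back to `call`. A random mask of the same budget removes the best action only some of the time, which is the intuition behind comparing learned masking against a random-masking baseline.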
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Game Playing | Kuhn Poker | Raw Reward: 0.98 | 6 |
| Game Playing | Leduc Poker | Raw Reward: 3.5 | 6 |
| Competitive multi-agent gridworld game | Competitive Gridworld 5x5, 149 P0 states | P0 Reward: 0.58 | 4 |
| Poker | Leduc standard (test) | -- | 1 |
| Poker | Leduc-5 standard (test) | -- | 1 |
| Poker | Leduc-10 standard (test) | -- | 1 |
| Poker | Leduc-20 standard (test) | -- | 1 |