When Actions Disappear: Adversarial Action Removal in Self-Play Reinforcement Learning

About

We study adversarial action masking in self-play reinforcement learning: an attacker selectively removes legal actions from a victim's action set. Unlike observation or action perturbations, removal eliminates decision options before the agent acts. Across poker games scaling from 6 to 5,531 information states and two non-poker domains, learned masking causes substantially more damage than random masking and learned perturbation baselines. The attack persists across Q-learning, PPO, NFSP, neural NFSP, and DQN victims; transfers across agents; is amplified by self-play; and shows no recovery under extended masked training. Mechanistically, the adversary targets high-value decision points, captured by reach-weighted contingent action capacity (CAC$_w$) and a value-weighted refinement CAC$_v$. These results identify action availability as a distinct robustness surface in self-play RL.
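To make the threat model concrete, the following is a minimal sketch of action removal against a greedy, value-based victim. It is illustrative only and is not the paper's attack: the function names (`mask_actions`, `random_mask`, `greedy_value_mask`, `victim_act`), the removal budget, the hand-written values, and the rule that at least one action always remains are all assumptions made for this example. The greedy attacker here stands in for a learned one.

```python
import random

def mask_actions(legal_actions, removed):
    """Return the legal actions left after the attacker removes `removed`.

    Assumption of this sketch: at least one action is always kept so the
    victim can still act.
    """
    remaining = [a for a in legal_actions if a not in removed]
    return remaining if remaining else list(legal_actions[:1])

def random_mask(legal_actions, budget, rng):
    """Baseline attacker: remove up to `budget` legal actions at random."""
    k = max(0, min(budget, len(legal_actions) - 1))
    return set(rng.sample(list(legal_actions), k))

def greedy_value_mask(legal_actions, q_values, budget):
    """Hypothetical targeted attacker: remove the victim's highest-valued actions."""
    k = max(0, min(budget, len(legal_actions) - 1))
    ranked = sorted(legal_actions, key=lambda a: q_values[a], reverse=True)
    return set(ranked[:k])

def victim_act(legal_actions, q_values):
    """Greedy victim: choose the best action among those still available."""
    return max(legal_actions, key=lambda a: q_values[a])

# Hypothetical single decision point with made-up action values.
legal = ["fold", "call", "raise"]
q = {"fold": -1.0, "call": 0.2, "raise": 0.9}

removed = greedy_value_mask(legal, q, budget=1)
remaining = mask_actions(legal, removed)
choice = victim_act(remaining, q)
print(removed, remaining, choice)
```

In this toy case the targeted attacker removes the victim's preferred action and forces the second-best one, whereas `random_mask` does so only some of the time. That difference is the intuition behind comparing learned masking against a random-masking baseline.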

Arahan Kujur • 2026

Related benchmarks

Task	Dataset	Result
Game Playing	Kuhn Poker	Raw Reward: 0.98	6
Game Playing	Leduc Poker	Raw Reward: 3.5	6
Competitive multi-agent gridworld game	Competitive Gridworld 5x5, 149 P0 states	P0 Reward: 0.58	4
Poker	Leduc standard (test)	--	1
Poker	Leduc-5 standard (test)	--	1
Poker	Leduc-10 standard (test)	--	1
Poker	Leduc-20 standard (test)	--	1

