MaDi: Learning to Mask Distractions for Generalization in Visual Deep Reinforcement Learning
About
The visual world provides an abundance of information, but many of the input pixels an agent receives contain distracting stimuli. Autonomous agents need the ability to distinguish useful information from task-irrelevant perceptions, enabling them to generalize to unseen environments with new distractions. Existing works approach this problem using data augmentation or large auxiliary networks with additional loss functions. We introduce MaDi, a novel algorithm that learns to mask distractions using only the reward signal. In MaDi, the conventional actor-critic structure of deep reinforcement learning agents is complemented by a small third sibling, the Masker. This lightweight neural network generates a mask that determines what the actor and critic will receive, so that they can focus on learning the task. The masks are created dynamically, depending on the current input. We run experiments on the DeepMind Control Generalization Benchmark, the Distracting Control Suite, and a real UR5 Robotic Arm. Our algorithm improves the agent's focus with useful masks, while its efficient Masker network adds only 0.2% more parameters to the original structure, in contrast to previous work. MaDi consistently achieves generalization results better than or competitive with state-of-the-art methods.
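The core mechanism described above can be sketched as a soft per-pixel mask applied to the observation before it reaches the actor and critic. The following is a minimal illustration, not the paper's implementation: the Masker network itself is omitted (its architecture is not detailed here), and only the masking step is shown, assuming the Masker outputs one raw logit per pixel.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def apply_mask(obs, mask_logits):
    """Apply a soft per-pixel mask to a stacked-frame observation.

    obs:         (C, H, W) pixel observation (e.g. stacked RGB frames)
    mask_logits: (H, W) raw per-pixel outputs of the Masker network
                 (hypothetical interface; the network itself is omitted)
    """
    mask = sigmoid(mask_logits)       # mask values in [0, 1]; 0 suppresses a pixel
    return obs * mask[None, :, :]     # the same masked view is fed to actor and critic

# Toy example: a 2x2 observation in which one pixel is a distraction.
obs = np.ones((3, 2, 2))
logits = np.array([[10.0, 10.0],
                   [10.0, -10.0]])   # strongly suppress the bottom-right pixel
masked = apply_mask(obs, logits)
```

Because the mask is produced by a differentiable network and multiplies the input to both actor and critic, gradients from the standard reinforcement learning losses flow into the Masker, which is how it can be trained from the reward signal alone without an auxiliary loss.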
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Reinforcement Learning | Procgen (test) | BigFish Return | 11.96 | 21 |
| Visual Reinforcement Learning | DMControl Cheetah Run | Episode Return | 432 | 16 |
| Visual Reinforcement Learning | DMControl Reacher Easy | Episode Return | 766 | 16 |
| Visual Reinforcement Learning | DMControl Walker Walk | Episode Return | 574 | 16 |
| Visual Reinforcement Learning | DMControl Ball in cup, Catch | Episode Return | 884 | 16 |
| Visual Reinforcement Learning | DMControl Finger, Spin | Episode Return | 810 | 16 |
| Visual Reinforcement Learning | DMControl Cartpole, Swingup | Episode Return | 704 | 16 |
| Visual Reinforcement Learning | DMControl Walker Run (test) | Environment Reward | 382 | 5 |
| Visual Reinforcement Learning | DMControl Pendulum, Swingup (test) | Episode Reward (ER) | 372 | 5 |
| Visual Reinforcement Learning | DMControl Hopper, Hop (test) | Episode Reward (ER) | 80 | 5 |