Look where you look! Saliency-guided Q-networks for generalization in visual Reinforcement Learning

About

Deep reinforcement learning policies, despite their outstanding efficiency in simulated visual control tasks, have shown disappointing ability to generalize across disturbances in the input training images. Changes in image statistics or distracting background elements are pitfalls that prevent generalization and real-world applicability of such control policies. We elaborate on the intuition that a good visual policy should be able to identify which pixels are important for its decision, and preserve this identification of important sources of information across images. This implies that training of a policy with a small generalization gap should focus on such important pixels and ignore the others. This leads to the introduction of saliency-guided Q-networks (SGQN), a generic method for visual reinforcement learning that is compatible with any value function learning method. SGQN vastly improves the generalization capability of Soft Actor-Critic agents and outperforms existing state-of-the-art methods on the DeepMind Control Generalization benchmark, setting a new reference in terms of training efficiency, generalization gap, and policy interpretability.

David Bertoin, Adil Zouitine, Mehdi Zouitine, Emmanuel Rachelson • 2022
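
The abstract's core intuition, that a policy should know which pixels matter for its decision, can be illustrated with a small sketch. The abstract does not say how SGQN computes saliency, so everything below is an assumption made for illustration: the toy linear `q_value` stands in for a Q-network, sensitivity is estimated by finite differences, and the names `pixel_sensitivity` and `top_k_mask` are hypothetical. It is not the authors' implementation.

```python
# Toy illustration only: NOT the SGQN implementation.
# Assumptions: a hypothetical linear Q-function over a 3x3 "image",
# and finite differences as the measure of pixel importance.

WEIGHTS = [[0.0, 0.0, 0.0],
           [0.0, 5.0, 0.0],
           [0.0, 0.0, -3.0]]


def q_value(image):
    """Linear stand-in for a Q-network: weighted sum of pixel values."""
    return sum(w * p
               for w_row, p_row in zip(WEIGHTS, image)
               for w, p in zip(w_row, p_row))


def pixel_sensitivity(q_fn, image, eps=1e-3):
    """Estimate |dQ/dpixel| for every pixel by central finite differences."""
    sens = []
    for i in range(len(image)):
        row = []
        for j in range(len(image[i])):
            plus = [r[:] for r in image]
            minus = [r[:] for r in image]
            plus[i][j] += eps
            minus[i][j] -= eps
            row.append(abs(q_fn(plus) - q_fn(minus)) / (2 * eps))
        sens.append(row)
    return sens


def top_k_mask(sens, k):
    """Binary mask marking the k most sensitive pixels (ties may add more)."""
    flat = sorted((v for row in sens for v in row), reverse=True)
    threshold = flat[k - 1]
    return [[1 if v >= threshold else 0 for v in row] for row in sens]


image = [[0.2, 0.4, 0.1],
         [0.9, 0.5, 0.3],
         [0.7, 0.6, 0.8]]
mask = top_k_mask(pixel_sensitivity(q_value, image), k=2)

# A "distracted" image: pixels outside the mask are replaced, salient ones kept.
distracted = [[p if m else 1.0 - p for p, m in zip(p_row, m_row)]
              for p_row, m_row in zip(image, mask)]
q_gap = abs(q_value(image) - q_value(distracted))
```

In this toy setting the mask selects the two pixels with non-zero weight, and changing every other pixel leaves the Q-value unchanged, which is the behavior the abstract argues a well-generalizing visual policy should have.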

Related benchmarks

Task | Dataset | Result | Rank
Continuous Control | DMC-GB video hard | Cartpole Swingup Score: 5.44e+4 | 18
Continuous Control | DMC-GB video easy | Cartpole Swingup Score: 717 | 12
Finger Spin | DMControl Novel view (test) | Reward: 553.3 | 12
Cup Catch | DMControl Novel view (test) | Reward: 803 | 12
Robotic manipulation (Reach) | Robotic-Manipulation reach (test2) | Performance: 33 | 7
Cheetah Run | DMControl-GB color-easy (test) | Average Episode Return: 312 | 7
Robotic Manipulation | peg-in-box (test2) | Return: 194 | 7
Robotic Manipulation | peg-in-box (test3) | Return: 198 | 7
Robotic manipulation (Reach) | Robotic-Manipulation reach (train) | Performance: 33 | 7
Walker Walk | DMControl-GB color-easy (test) | Avg Episode Return: 805 | 7
Showing 10 of 46 rows
