Pre-Trained Image Encoder for Generalizable Visual Reinforcement Learning

About

Learning generalizable policies that can adapt to unseen environments remains challenging in visual reinforcement learning (RL). Existing approaches try to acquire a robust representation by diversifying the appearance of in-domain observations. Because they are limited to the observations available in the environment, these methods overlook the possibility of exploiting diverse real-world image datasets. In this paper, we investigate how a visual RL agent can benefit from off-the-shelf visual representations. Surprisingly, we find that the early layers of an ImageNet pre-trained ResNet model provide rather generalizable representations for visual RL. Hence, we propose Pre-trained Image Encoder for Generalizable visual reinforcement learning (PIE-G), a simple yet effective framework that generalizes to unseen visual scenarios in a zero-shot manner. Extensive experiments on the DMControl Generalization Benchmark, DMControl Manipulation Tasks, Drawer World, and CARLA verify the effectiveness of PIE-G. Empirical evidence suggests that PIE-G improves sample efficiency and significantly outperforms previous state-of-the-art methods in generalization performance. In particular, PIE-G achieves a 55% average gain in generalization performance in the challenging video background setting. Project Page: https://sites.google.com/view/pie-g/home.

Zhecheng Yuan, Zhengrong Xue, Bo Yuan, Xueqian Wang, Yi Wu, Yang Gao, Huazhe Xu • 2022

Related benchmarks

Task	Dataset	Result
Continuous Control	DMC-GB video hard	Cartpole Swingup Score: 401	18
Cup Catch	DMControl Novel view (test)	Reward: 927.3	12
Finger Spin	DMControl Novel view (test)	Reward: 755.9	12
Continuous Control	DMC-GB video easy	Cartpole Swingup Score: 587	12
Ball In Cup Catch	DMC-GB color-jittered (test)	Average Return: 964	6
Finger Spin	DMControl Shaking view (test)	Reward: 551.8	6
Manipulation	DeepMind Manipulation tasks Modified Arm	Average Return: 122	6
Manipulation	DeepMind Manipulation tasks Modified Platform	Average Return: 96	6
Manipulation	DeepMind Manipulation tasks Modified Both	Average Return: 44	6
Walker Stand	DMC-GB color-jittered (test)	Average Return: 960	6

Showing 10 of 30 rows
