Mastering Visual Continuous Control: Improved Data-Augmented Reinforcement Learning

About

We present DrQ-v2, a model-free reinforcement learning (RL) algorithm for visual continuous control. DrQ-v2 builds on DrQ, an off-policy actor-critic approach that uses data augmentation to learn directly from pixels. We introduce several improvements that yield state-of-the-art results on the DeepMind Control Suite. Notably, DrQ-v2 is able to solve complex humanoid locomotion tasks directly from pixel observations, a result previously unattained by model-free RL. DrQ-v2 is conceptually simple, easy to implement, and has a significantly smaller computational footprint than prior work, with the majority of tasks taking just 8 hours to train on a single GPU. Finally, we publicly release DrQ-v2's implementation to provide RL practitioners with a strong and computationally efficient baseline.
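
The abstract says only that the method relies on data augmentation and does not name the transform. One common choice in pixel-based RL is a random shift: pad the frame by a few pixels, then crop back to the original size at a random offset. The sketch below illustrates that idea under this assumption. It is not DrQ-v2's released implementation, and the name `random_shift`, the `pad` value, and the toy 6x6 frame are hypothetical.

```python
import random


def random_shift(image, pad=4, rng=None):
    """Shift a 2D image by up to `pad` pixels in each direction.

    Equivalent to padding with edge replication and then cropping a window
    of the original size at a random offset. `image` is a list of
    equal-length rows; the output has the same shape.
    """
    if rng is None:
        rng = random.Random()
    height, width = len(image), len(image[0])

    # Top-left corner of the crop inside the (virtual) padded image.
    dy = rng.randint(0, 2 * pad)
    dx = rng.randint(0, 2 * pad)

    def clamp(index, size):
        # Edge replication: out-of-range indices read the nearest border pixel.
        return min(max(index, 0), size - 1)

    return [
        [image[clamp(y + dy - pad, height)][clamp(x + dx - pad, width)]
         for x in range(width)]
        for y in range(height)
    ]


# Toy "frame" standing in for a pixel observation (hypothetical data).
frame = [[10 * r + c for c in range(6)] for r in range(6)]
shifted = random_shift(frame, pad=2, rng=random.Random(0))
```

In an actor-critic setup of the kind the abstract describes, a transform like this would be applied independently to each observation sampled from the replay buffer before it is fed to the encoder, so the networks see slightly different views of the same state.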

Denis Yarats, Rob Fergus, Alessandro Lazaric, Lerrel Pinto • 2021

Related benchmarks

Task	Dataset	Result
Continuous Control	DMC-GB video hard	Cartpole Swingup Score 130	18
Autonomous Driving	NoCrash Town02	Return 1.02e+3	15
Finger Spin	DMControl Novel view (test)	Reward 793.7	12
Cup Catch	DMControl Novel view (test)	Reward 919.8	12
Continuous Control	DMC-GB video easy	Cartpole Swingup Score 267	12
Autonomous Driving	CARLA Leaderboard	Return 1.51e+3	9
Autonomous Driving	NoCrash (Town01)	Return 1.66e+3	8
Visual Reinforcement Learning	DMControl VDCS Markov-temporal perturbations (test)	Cartpole Swingup Score 49	8
LiftPegUpright	ManiSkill3 Medium Ground Texture	Success Rate 42	7
LiftPegUpright	ManiSkill3 Easy Lighting Direction v1 (test)	Success Rate 46	7

Showing 10 of 281 rows
