
Mastering Visual Continuous Control: Improved Data-Augmented Reinforcement Learning

About

We present DrQ-v2, a model-free reinforcement learning (RL) algorithm for visual continuous control. DrQ-v2 builds on DrQ, an off-policy actor-critic approach that uses data augmentation to learn directly from pixels. We introduce several improvements that yield state-of-the-art results on the DeepMind Control Suite. Notably, DrQ-v2 is able to solve complex humanoid locomotion tasks directly from pixel observations, previously unattained by model-free RL. DrQ-v2 is conceptually simple, easy to implement, and has a significantly smaller computational footprint than prior work, with the majority of tasks taking just 8 hours to train on a single GPU. Finally, we publicly release DrQ-v2's implementation to provide RL practitioners with a strong and computationally efficient baseline.
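The data augmentation at the heart of DrQ-style methods is a random shift: each pixel observation is padded with replicated edge pixels and then cropped back to its original size at a random offset. The released implementation uses PyTorch; the NumPy sketch below only illustrates the idea (the function name and the `pad=4` default for 84×84 frames are our assumptions, not taken from the release).

```python
import numpy as np

def random_shift(obs, pad=4, rng=None):
    """Random-shift augmentation: pad each spatial edge with `pad`
    replicated pixels, then crop a random window of the original size.
    `obs` is a channel-first image, e.g. a stack of RGB frames."""
    rng = rng or np.random.default_rng()
    c, h, w = obs.shape
    # Replicate edge pixels so shifted crops stay visually plausible.
    padded = np.pad(obs, ((0, 0), (pad, pad), (pad, pad)), mode="edge")
    # Pick a random top-left corner within the padded image.
    top = rng.integers(0, 2 * pad + 1)
    left = rng.integers(0, 2 * pad + 1)
    return padded[:, top:top + h, left:left + w]

# Example: augment a single 84x84 RGB observation.
obs = np.arange(3 * 84 * 84, dtype=np.float32).reshape(3, 84, 84)
aug = random_shift(obs, pad=4)
print(aug.shape)  # (3, 84, 84) -- same shape as the input
```

In training, a fresh random shift is applied independently to every observation sampled from the replay buffer before it is passed to the encoder.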

Denis Yarats, Rob Fergus, Alessandro Lazaric, Lerrel Pinto • 2021

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Continuous Control | DMC-GB video hard | Cartpole Swingup Score 130 | 18 |
| Autonomous Driving | NoCrash Town02 | Return 1.02e+3 | 15 |
| Finger Spin | DMControl Novel view (test) | Reward 793.7 | 12 |
| Cup Catch | DMControl Novel view (test) | Reward 919.8 | 12 |
| Continuous Control | DMC-GB video easy | Cartpole Swingup Score 267 | 12 |
| Autonomous Driving | CARLA Leaderboard | Return 1.51e+3 | 9 |
| Autonomous Driving | NoCrash (Town01) | Return 1.66e+3 | 8 |
| Continuous Control | DMControl | Point Mass Easy 525 | 7 |
| Robot Manipulation | Meta-world v1 v2 (train test) | Basketball 0.00e+0 | 7 |
| Manipulation | DeepMind Manipulation tasks (train) | Avg Return 204 | 6 |

Showing 10 of 43 rows.
