
Smooth Exploration for Robotic Reinforcement Learning

About

Reinforcement learning (RL) enables robots to learn skills from interactions with the real world. In practice, the unstructured step-based exploration used in Deep RL, often very successful in simulation, leads to jerky motion patterns on real robots. The resulting shaky behavior causes poor exploration and can even damage the robot. We address these issues by adapting state-dependent exploration (SDE) to current Deep RL algorithms. To enable this adaptation, we propose two extensions to the original SDE: using more general features and re-sampling the noise periodically. Together these yield a new exploration method, generalized state-dependent exploration (gSDE). We evaluate gSDE both in simulation, on PyBullet continuous control tasks, and directly on three different real robots: a tendon-driven elastic robot, a quadruped and an RC car. The noise sampling interval of gSDE provides a trade-off between performance and smoothness, which makes it possible to train directly on the real robots without loss of performance. The code is available at https://github.com/DLR-RM/stable-baselines3.
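
The two ideas named in the abstract, noise that is a function of the policy features and noise parameters that are re-sampled only every few steps, can be illustrated with a small standard-library sketch. This is not the authors' implementation: the class name `GSDENoise`, its arguments, and the fixed Gaussian scale `sigma` are hypothetical choices made for illustration, and the learned noise variance used in practice is left out.

```python
import random


class GSDENoise:
    """Hypothetical sketch of state-dependent exploration noise.

    The noise is a linear function of a feature vector, and the weight
    matrix `theta` is re-drawn only every `sample_freq` calls instead of
    at every step.
    """

    def __init__(self, feature_dim, action_dim, sigma=0.5, sample_freq=4, seed=0):
        self.rng = random.Random(seed)
        self.feature_dim = feature_dim
        self.action_dim = action_dim
        self.sigma = sigma              # assumed fixed here for simplicity
        self.sample_freq = sample_freq  # steps between two re-samplings
        self.steps = 0
        self.theta = None
        self.resample()

    def resample(self):
        # Draw a new exploration matrix theta ~ N(0, sigma^2), entry-wise.
        self.theta = [
            [self.rng.gauss(0.0, self.sigma) for _ in range(self.action_dim)]
            for _ in range(self.feature_dim)
        ]

    def __call__(self, features):
        # Re-sample theta once every `sample_freq` steps (not at step 0,
        # because the constructor already drew an initial matrix).
        if self.steps > 0 and self.steps % self.sample_freq == 0:
            self.resample()
        self.steps += 1
        # noise_j = sum_i features_i * theta_ij
        return [
            sum(f * self.theta[i][j] for i, f in enumerate(features))
            for j in range(self.action_dim)
        ]


if __name__ == "__main__":
    noise = GSDENoise(feature_dim=3, action_dim=2, sample_freq=4)
    features = [0.1, -0.2, 0.3]  # stand-in for the policy's latent features
    for step in range(8):
        print(step, noise(features))
```

With identical features, the noise stays constant within each four-step window and changes only when `theta` is re-drawn, which is the source of the smoother motion; a larger `sample_freq` gives smoother but less varied exploration. In Stable-Baselines3, the repository linked in the abstract, gSDE is enabled through the `use_sde` and `sde_sample_freq` arguments of algorithms such as PPO, A2C and SAC.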

Antonin Raffin, Jens Kober, Freek Stulp • 2020

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Locomotion | PyBullet Walker | Energy Consumption | 0.25 | 8 |
| Locomotion | PyBullet Humanoid | Energy Consumption | 0.11 | 8 |
| Locomotion | PyBullet Ant | Energy Consumption | 0.23 | 8 |
| Locomotion | PyBullet Hopper | Energy | 0.23 | 8 |
| Locomotion | PyBullet Half cheetah | Energy Consumption | 0.23 | 8 |
| Elbow Pose | MyoSuite | Energy | 0.18 | 4 |
| Finger Pose | MyoSuite | Energy | 0.02 | 4 |
| Baoding | MyoSuite (test) | Energy | 0.07 | 4 |
| Hand reach | MyoSuite (test) | Energy | 0.07 | 4 |
| Finger reach | MyoSuite Finger reach (N=5 seeds) | Energy | 0.07 | 4 |
Showing 10 of 13 rows
