
For SALE: State-Action Representation Learning for Deep Reinforcement Learning

About

In the field of reinforcement learning (RL), representation learning is a proven tool for complex image-based tasks, but is often overlooked for environments with low-level states, such as physical control problems. This paper introduces SALE, a novel approach for learning embeddings that model the nuanced interaction between state and action, enabling effective representation learning from low-level states. We extensively study the design space of these embeddings and highlight important design considerations. We integrate SALE and an adaptation of checkpoints for RL into TD3 to form the TD7 algorithm, which significantly outperforms existing continuous control algorithms. On OpenAI gym benchmark tasks, TD7 has an average performance gain of 276.7% and 50.7% over TD3 at 300k and 5M time steps, respectively, and works in both the online and offline settings.
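
The abstract does not spell out how the state-action embeddings are trained, so the following is only a minimal sketch under stated assumptions: a state encoder `f`, a state-action encoder `g` that takes the state embedding together with the action, and a loss that pushes `g(f(s), a)` toward the embedding of the next state. The single-layer encoders, dimensions, and random data are all hypothetical and chosen only for illustration. They are not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
state_dim, action_dim, embed_dim, batch = 4, 2, 8, 16  # hypothetical sizes

# Assumption: single-layer encoders stand in for the multi-layer networks
# a real agent would use.
W_f = rng.normal(scale=0.1, size=(state_dim, embed_dim))               # state encoder f
W_g = rng.normal(scale=0.1, size=(embed_dim + action_dim, embed_dim))  # state-action encoder g


def f(s):
    """State embedding z_s = f(s)."""
    return np.tanh(s @ W_f)


def g(z_s, a):
    """State-action embedding z_sa = g(z_s, a)."""
    return np.concatenate([z_s, a], axis=-1) @ W_g


def embedding_loss(s, a, s_next):
    """Mean squared error between z_sa and the next state's embedding.

    Assumption: the next-state embedding is treated as a fixed target.
    """
    target = f(s_next)
    pred = g(f(s), a)
    return float(np.mean((pred - target) ** 2))


# Random low-level states and actions in place of real environment transitions.
s = rng.normal(size=(batch, state_dim))
a = rng.uniform(-1.0, 1.0, size=(batch, action_dim))
s_next = s + 0.1 * rng.normal(size=(batch, state_dim))

z_sa = g(f(s), a)
loss = embedding_loss(s, a, s_next)
```

In a full agent, embeddings like `f(s)` and `z_sa` would be passed to the value function and policy alongside the raw state and action. This sketch only computes the embeddings and the loss and performs no gradient updates.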

Scott Fujimoto, Wei-Di Chang, Edward J. Smith, Shixiang Shane Gu, Doina Precup, David Meger • 2023

Related benchmarks

Task                    Dataset                Result                        Rank
Continuous Control      MuJoCo Ant v4          Average Return 8.51e+3        46
Continuous Control      MuJoCo Walker2d v4     --                            39
Continuous Control      MuJoCo HalfCheetah v4  Average Return 1.74e+4        36
Reinforcement Learning  MuJoCo HalfCheetah v2  Average Return 1.82e+4        18
Reinforcement Learning  MuJoCo Ant v2          Average Return 1.01e+4        18
Reinforcement Learning  MuJoCo Hopper v2       Average Return 4.08e+3        18
Reinforcement Learning  MuJoCo Walker2d v2     Average Return 7.40e+3        18
Reinforcement Learning  MuJoCo Humanoid v2     Average Return 1.03e+4        18
Reinforcement Learning  Gym MuJoCo             Normalized Mean Return 1.02   18
Reinforcement Learning  DMC Hard               Normalized Mean Return 1.02   18
Showing 10 of 23 rows
