For SALE: State-Action Representation Learning for Deep Reinforcement Learning

About

In the field of reinforcement learning (RL), representation learning is a proven tool for complex image-based tasks, but is often overlooked for environments with low-level states, such as physical control problems. This paper introduces SALE, a novel approach for learning embeddings that model the nuanced interaction between state and action, enabling effective representation learning from low-level states. We extensively study the design space of these embeddings and highlight important design considerations. We integrate SALE and an adaptation of checkpoints for RL into TD3 to form the TD7 algorithm, which significantly outperforms existing continuous control algorithms. On OpenAI gym benchmark tasks, TD7 has an average performance gain of 276.7% and 50.7% over TD3 at 300k and 5M time steps, respectively, and works in both the online and offline settings.

Scott Fujimoto, Wei-Di Chang, Edward J. Smith, Shixiang Shane Gu, Doina Precup, David Meger • 2023
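The abstract describes embeddings that model the interaction between state and action but gives no architectural detail on this page. The following is a minimal sketch of one way such embeddings could be structured, not the authors' implementation. It assumes a state encoder, a state-action encoder that consumes the state embedding together with the action, and a training signal that asks the state-action embedding to predict the embedding of the next state. All names here (`StateActionEmbedding`, `avg_l1_norm`, `prediction_loss`) are hypothetical, and fixed random linear maps stand in for trained neural networks.

```python
import random


def linear(weights, x):
    """Apply a weight matrix (a list of rows) to a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def avg_l1_norm(x, eps=1e-8):
    """Rescale a vector so the mean absolute value of its entries is 1."""
    scale = max(sum(abs(v) for v in x) / len(x), eps)
    return [v / scale for v in x]


def random_matrix(rows, cols, rng):
    """Random weights; a stand-in for learned parameters (assumption)."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


class StateActionEmbedding:
    """Hypothetical pair of encoders: state -> z_s, (z_s, action) -> z_sa."""

    def __init__(self, state_dim, action_dim, embed_dim, seed=0):
        rng = random.Random(seed)
        # State encoder: maps a low-level state to an embedding.
        self.f = random_matrix(embed_dim, state_dim, rng)
        # State-action encoder: maps [z_s, action] to a joint embedding.
        self.g = random_matrix(embed_dim, embed_dim + action_dim, rng)

    def encode_state(self, state):
        # Normalizing keeps the embedding scale fixed (an assumed design choice).
        return avg_l1_norm(linear(self.f, state))

    def encode_state_action(self, state, action):
        z_s = self.encode_state(state)
        return linear(self.g, z_s + list(action))

    def prediction_loss(self, state, action, next_state):
        """Mean squared error between z_sa and the next state's embedding.

        In a real training loop the target would be treated as a constant
        (no gradient through it); this sketch only computes the value.
        """
        z_sa = self.encode_state_action(state, action)
        target = self.encode_state(next_state)
        return sum((p - t) ** 2 for p, t in zip(z_sa, target)) / len(target)


if __name__ == "__main__":
    model = StateActionEmbedding(state_dim=3, action_dim=2, embed_dim=4)
    state, action, next_state = [0.1, -0.2, 0.3], [0.5, -0.5], [0.2, -0.1, 0.3]
    print(model.encode_state_action(state, action))
    print(model.prediction_loss(state, action, next_state))
```

In a full agent, embeddings like these would be passed to the value function and policy alongside the raw state and action; how TD7 wires them together is not stated on this page.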

Related benchmarks

Task	Dataset	Result
Continuous Control	MuJoCo Walker2d v4	--	51
Continuous Control	MuJoCo Ant v4	Average Return: 8.51e+3	46
Continuous Control	MuJoCo HalfCheetah v4	Average Return: 1.74e+4	36
Reinforcement Learning	MuJoCo HalfCheetah v2	Average Return: 1.82e+4	18
Reinforcement Learning	MuJoCo Ant v2	Average Return: 1.01e+4	18
Reinforcement Learning	MuJoCo Hopper v2	Average Return: 4.08e+3	18
Reinforcement Learning	MuJoCo Walker2d v2	Average Return: 7.40e+3	18
Reinforcement Learning	MuJoCo Humanoid v2	Average Return: 1.03e+4	18
Reinforcement Learning	Gym MuJoCo	Normalized Mean Return: 1.02	18
Reinforcement Learning	DMC Hard	Normalized Mean Return: 1.02	18

Showing 10 of 27 rows
