DSAC: Distributional Soft Actor-Critic for Risk-Sensitive Reinforcement Learning
About
We present Distributional Soft Actor-Critic (DSAC), a distributional reinforcement learning (RL) algorithm that combines distributional modeling of accumulated rewards with the entropy-driven exploration of the Soft Actor-Critic (SAC) algorithm. DSAC models the randomness in both actions and rewards, surpassing baselines on a range of continuous control tasks. Unlike standard approaches that maximize only the expected return, we propose a unified framework for risk-sensitive learning that optimizes a risk-related objective while using an entropy term to encourage exploration. Extensive experiments demonstrate DSAC's effectiveness in improving agent performance on both risk-neutral and risk-sensitive control tasks.
Xiaoteng Ma, Junyao Chen, Li Xia, Jun Yang, Qianchuan Zhao, Zhengyuan Zhou • 2020
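To make the abstract's two ingredients concrete, here is a minimal sketch of a distributional critic trained by quantile regression plus an entropy-regularized, risk-sensitive actor objective. It is an illustration only, not the authors' reference implementation: the quantile parameterization, the CVaR risk measure, and all names and hyperparameters below (`N_QUANTILES`, `alpha`, `cvar_level`) are assumptions for the sketch, and the paper's actual choices may differ.

```python
# Minimal sketch (assumed details, not the paper's reference code) of:
# (1) a quantile-based distributional critic, and
# (2) an entropy-regularized, risk-sensitive actor loss (CVaR for illustration).
import torch
import torch.nn as nn

N_QUANTILES = 32  # hypothetical choice; the paper's setting may differ


class QuantileCritic(nn.Module):
    """Maps (state, action) to N_QUANTILES estimates of the return distribution."""

    def __init__(self, state_dim: int, action_dim: int, hidden: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + action_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, N_QUANTILES),
        )

    def forward(self, state: torch.Tensor, action: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([state, action], dim=-1))


def quantile_huber_loss(pred: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
    """Standard quantile-regression Huber loss from distributional RL.

    pred, target: (batch, N_QUANTILES) quantile estimates.
    """
    taus = (torch.arange(N_QUANTILES, dtype=torch.float32) + 0.5) / N_QUANTILES
    # Pairwise TD errors between every target and every predicted quantile.
    td = target.unsqueeze(-2) - pred.unsqueeze(-1)           # (B, N_pred, N_target)
    huber = torch.where(td.abs() <= 1.0, 0.5 * td ** 2, td.abs() - 0.5)
    # Asymmetric weight |tau - 1{td < 0}|, tau indexed along the pred dimension.
    weight = (taus.view(1, -1, 1) - (td.detach() < 0).float()).abs()
    return (weight * huber).mean()


def risk_sensitive_actor_loss(quantiles: torch.Tensor,
                              log_prob: torch.Tensor,
                              alpha: float = 0.2,
                              cvar_level: float = 0.25) -> torch.Tensor:
    """Minimize alpha * log_prob - risk(Z), i.e. maximize a risk measure of the
    return distribution plus an entropy bonus, as in SAC-style updates.

    CVaR at `cvar_level` averages the worst quantiles; cvar_level=1.0 recovers
    the risk-neutral (expected-return) objective.
    """
    k = max(1, int(cvar_level * quantiles.shape[-1]))
    worst_k, _ = torch.topk(quantiles, k, dim=-1, largest=False)
    risk_value = worst_k.mean(dim=-1)                        # CVaR estimate
    return (alpha * log_prob - risk_value).mean()
```

Under these assumptions, `cvar_level` is the single knob that moves the agent along the risk spectrum: small values weight the left tail of the return distribution (risk-averse behavior on the Risky tasks below), while `cvar_level = 1.0` collapses to the usual expected-return objective.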
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Continuous Control | DeepMind Control Suite Vision Cheetah-Run (test) | AULC | 770.5 | 5 |
| Continuous Control | DMC Vision Finger-Turn Hard (test) | AULC | 661.1 | 5 |
| Continuous Control | DeepMind Control Suite Vision Quadruped-Run (test) | AULC | 550.2 | 5 |
| Continuous Control | DMC Vision Reacher-Hard (test) | AULC | 773.1 | 5 |
| Robot navigation | Risky PointMass (test) | Mean Return | -7.69 | 5 |
| Continuous Control | DMC Vision Walker-Run (test) | AULC | 509.5 | 5 |
| Robot navigation | Risky Ant (test) | Mean Return | -866.1 | 5 |
| Locomotion | DeepMind Control Suite Walker-Run | AULC | 637.6 | 4 |
| Soft Robot Control | EvoGym BidirectionalWalker-v0 | AULC | 4.68 | 4 |
| Locomotion | DeepMind Control Suite Dog-Walk | AULC | 468.3 | 4 |
*Showing 10 of 21 benchmark rows.*