
DSAC: Distributional Soft Actor-Critic for Risk-Sensitive Reinforcement Learning

About

We present Distributional Soft Actor-Critic (DSAC), a distributional reinforcement learning (RL) algorithm that combines distributional information about accumulated rewards with the entropy-driven exploration of the Soft Actor-Critic (SAC) algorithm. DSAC models the randomness in both actions and rewards and surpasses baseline performance on various continuous control tasks. Unlike standard approaches that solely maximize expected rewards, DSAC provides a unified framework for risk-sensitive learning that optimizes a risk-related objective while balancing entropy to encourage exploration. Extensive experiments demonstrate DSAC's effectiveness in improving agent performance on both risk-neutral and risk-sensitive control tasks.
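To make the idea of a risk-related objective balanced against entropy concrete, here is a minimal sketch. It is not taken from the abstract: it assumes the return distribution is represented by equally weighted return samples and that the risk measure is conditional value at risk (CVaR). The function names, sample values, `alpha`, and `temperature` are all hypothetical choices for illustration.

```python
import math


def cvar(samples, alpha):
    """Mean of the worst alpha-fraction of equally weighted return samples.

    alpha = 1.0 reduces to the ordinary mean (risk-neutral);
    a smaller alpha focuses on the lower tail (risk-averse).
    """
    ordered = sorted(samples)
    k = max(1, math.ceil(alpha * len(ordered)))
    return sum(ordered[:k]) / k


def risk_sensitive_score(samples, log_prob, alpha=0.25, temperature=0.2):
    """Risk measure of the return samples plus an entropy bonus.

    The bonus -temperature * log_prob mirrors the entropy term used in
    SAC-style objectives; this particular combination is an assumption.
    """
    return cvar(samples, alpha) - temperature * log_prob


# Hypothetical return samples for two candidate actions.
safe = [4.0, 5.0, 5.0, 6.0]
risky = [-10.0, 8.0, 10.0, 14.0]

risk_neutral_gap = cvar(risky, 1.0) - cvar(safe, 1.0)
risk_averse_gap = cvar(risky, 0.25) - cvar(safe, 0.25)
```

In this toy setting the risky action has the higher mean return, so a risk-neutral criterion prefers it, while the tail-focused criterion prefers the safe action because the risky action's worst outcome is much lower.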

Xiaoteng Ma, Junyao Chen, Li Xia, Jun Yang, Qianchuan Zhao, Zhengyuan Zhou • 2020

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Continuous Control | DeepMind Control Suite (DMC) | Cheetah Run | 753 | 15 |
| Continuous Control | MuJoCo v5 | Ant Score | 776 | 15 |
| Continuous Control | MuJoCo | Ant-v5 | 776 | 9 |
| Continuous Control | DeepMind Control Suite Vision Cheetah-Run (test) | AULC | 770.5 | 5 |
| Continuous Control | DMC Vision Finger-Turn Hard (test) | AULC | 661.1 | 5 |
| Continuous Control | DeepMind Control Suite Vision Quadruped-Run (test) | AULC | 550.2 | 5 |
| Continuous Control | DMC Vision Reacher-Hard (test) | AULC | 773.1 | 5 |
| Robot navigation | Risky PointMass (test) | Mean Return | -7.69 | 5 |
| Continuous Control | DMC Vision Walker-Run (test) | AULC | 509.5 | 5 |
| Continuous Control | DMC | Cheetah-run Score | 753 | 5 |
Showing 10 of 25 rows
