Risk-Averse Model Uncertainty for Distributionally Robust Safe Reinforcement Learning

About

Many real-world domains require safe decision making in uncertain environments. In this work, we introduce a deep reinforcement learning framework for approaching this important problem. We consider a distribution over transition models, and apply a risk-averse perspective towards model uncertainty through the use of coherent distortion risk measures. We provide robustness guarantees for this framework by showing it is equivalent to a specific class of distributionally robust safe reinforcement learning problems. Unlike existing approaches to robustness in deep reinforcement learning, however, our formulation does not involve minimax optimization. This leads to an efficient, model-free implementation of our approach that only requires standard data collection from a single training environment. In experiments on continuous control tasks with safety constraints, we demonstrate that our framework produces robust performance and safety at deployment time across a range of perturbed test environments.

James Queeney, Mouhacine Benosman • 2023
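To illustrate how a coherent distortion risk measure reweights outcomes across a distribution of transition models, here is a minimal sketch. It assumes each sampled model yields a single scalar safety cost and that samples are equally weighted. The function names and sample values are hypothetical and are not taken from the authors' implementation. Conditional Value-at-Risk (CVaR) is used as one well-known example of a coherent distortion risk measure.

```python
def distortion_risk(costs, g):
    """Distortion risk of equally weighted cost samples.

    `g` is a distortion function on [0, 1] with g(0) = 0 and g(1) = 1.
    A concave `g` shifts weight toward the worst (highest-cost) samples.
    """
    ordered = sorted(costs, reverse=True)  # worst outcome first
    n = len(ordered)
    # Weight of the i-th worst sample is the increment of g over its quantile slice.
    return sum(x * (g((i + 1) / n) - g(i / n)) for i, x in enumerate(ordered))


def cvar_distortion(alpha):
    """Distortion function for CVaR at tail level alpha in (0, 1]."""
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    return lambda u: min(u / alpha, 1.0)


def identity(u):
    """Risk-neutral distortion: recovers the plain average."""
    return u


# Hypothetical safety costs, one per sampled transition model.
sampled_costs = [1.0, 2.0, 3.0, 4.0]

risk_neutral = distortion_risk(sampled_costs, identity)
risk_averse = distortion_risk(sampled_costs, cvar_distortion(0.5))
```

With these illustrative numbers, the risk-neutral value is the sample mean, while the CVaR value averages only the worse half of the samples, so a constraint placed on it is more conservative.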

Related benchmarks

Task	Dataset	Result
Stabilization	Safe-Control-Gym Cartpole Stab Dynamics Uncertainty	Average Return: 86	7
Tracking	Safe-Control-Gym Quadrotor Track Observation Uncertainty	Average Return: 176	7
Stabilization	Safe-Control-Gym Quadrotor Stab Observation Uncertainty	Average Return: 146	7
Stabilization	Safe-Control-Gym Cartpole Stab Action Uncertainty	Average Return: 86	7
Stabilization	Safe-Control-Gym Quadrotor Stab Dynamics Uncertainty	Average Return: 120	7
Tracking	Safe-Control-Gym Cartpole Track Observation Uncertainty	Average Return: 67	7
Stabilization	Safe-Control-Gym Cartpole Stab Observation Uncertainty	Average Return: 35	7
Tracking	Safe-Control-Gym Cartpole Track Dynamics Uncertainty	Average Return: 79	7
Stabilization	Safe-Control-Gym Quadrotor Stab Action Uncertainty	Average Return: 28	7
Tracking	Safe-Control-Gym Cartpole Track Action Uncertainty	Average Return: 70	7

Showing 10 of 12 rows
