Fuz-RL: A Fuzzy-Guided Robust Framework for Safe Reinforcement Learning under Uncertainty

About

Safe Reinforcement Learning (RL) is crucial for achieving high performance while ensuring safety in real-world applications. However, the complex interplay of multiple uncertainty sources in real environments poses significant challenges for interpretable risk assessment and robust decision-making. To address these challenges, we propose Fuz-RL, a fuzzy measure-guided robust framework for safe RL. Specifically, our framework develops a novel fuzzy Bellman operator for estimating robust value functions using Choquet integrals. Theoretically, we prove that solving the Fuz-RL problem (in Constrained Markov Decision Process (CMDP) form) is equivalent to solving distributionally robust safe RL problems (in robust CMDP form), effectively avoiding min-max optimization. Empirical analyses on safe-control-gym and safety-gymnasium scenarios demonstrate that Fuz-RL effectively integrates with existing safe RL baselines in a model-free manner, significantly improving both safety and control performance under various types of uncertainties in observation, action, and dynamics.
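The abstract says the fuzzy Bellman operator estimates robust value functions using Choquet integrals. The sketch below shows only the generic discrete Choquet integral of a finite set of values with respect to a normalized fuzzy measure; it is not the paper's operator. The function name `choquet_integral` and the three example measures are hypothetical and chosen for illustration, on the assumption that the values stand in for value estimates under a handful of perturbed scenarios.

```python
from typing import Callable, FrozenSet, Sequence

# A fuzzy measure maps a subset of scenario indices to a weight in [0, 1].
Measure = Callable[[FrozenSet[int]], float]


def choquet_integral(values: Sequence[float], measure: Measure) -> float:
    """Discrete Choquet integral of `values` w.r.t. a normalized fuzzy measure.

    Assumes measure(empty set) == 0 and measure(all indices) == 1.
    """
    # Sort indices so that values are visited in ascending order.
    order = sorted(range(len(values)), key=lambda i: values[i])
    total = 0.0
    previous = 0.0
    for rank, idx in enumerate(order):
        # Set of indices whose value is at least values[idx].
        upper_set = frozenset(order[rank:])
        total += (values[idx] - previous) * measure(upper_set)
        previous = values[idx]
    return total


def additive(n: int) -> Measure:
    # Uniform additive measure: the integral reduces to the arithmetic mean.
    return lambda s: len(s) / n


def worst_case(n: int) -> Measure:
    # Weight only on the full set: the integral reduces to the minimum.
    return lambda s: 1.0 if len(s) == n else 0.0


def pessimistic(n: int) -> Measure:
    # Convex distortion of the uniform measure (illustrative choice):
    # the result lies between the minimum and the mean.
    return lambda s: (len(s) / n) ** 2


if __name__ == "__main__":
    scenario_values = [1.0, 2.0, 4.0]  # made-up numbers, not results from the paper
    n = len(scenario_values)
    print(choquet_integral(scenario_values, additive(n)))
    print(choquet_integral(scenario_values, worst_case(n)))
    print(choquet_integral(scenario_values, pessimistic(n)))
```

Changing the measure moves the aggregate between the mean and the worst case without an explicit inner minimization, which is one way to read the abstract's remark about avoiding min-max optimization.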

Xu Wan, Chao Yang, Cheng Yang, Jie Song, Mingyang Sun • 2026

Related benchmarks

Task	Dataset	Metric	Result	
Stabilization	Safe-Control-Gym Cartpole Stab Observation Uncertainty	Average Return	59	7
Stabilization	Safe-Control-Gym Cartpole Stab Action Uncertainty	Average Return	102	7
Stabilization	Safe-Control-Gym Cartpole Stab Dynamics Uncertainty	Average Return	93	7
Stabilization	Safe-Control-Gym Quadrotor Stab Action Uncertainty	Average Return	94	7
Stabilization	Safe-Control-Gym Quadrotor Stab Dynamics Uncertainty	Average Return	156	7
Tracking	Safe-Control-Gym Cartpole Track Observation Uncertainty	Average Return	93	7
Tracking	Safe-Control-Gym Cartpole Track Action Uncertainty	Average Return	120	7
Tracking	Safe-Control-Gym Cartpole Track Dynamics Uncertainty	Average Return	112	7
Tracking	Safe-Control-Gym Quadrotor Track Action Uncertainty	Average Return	112	7
Stabilization	Safe-Control-Gym Quadrotor Stab Observation Uncertainty	Average Return	161	7

Showing 10 of 14 rows
