LexiSafe: Offline Safe Reinforcement Learning with Lexicographic Safety-Reward Hierarchy

About

Offline safe reinforcement learning (RL) is increasingly important for cyber-physical systems (CPS), where safety violations during training are unacceptable and only pre-collected data are available. Existing offline safe RL methods typically balance reward-safety tradeoffs through constraint relaxation or joint optimization, but they often lack structural mechanisms to prevent safety drift. We propose LexiSafe, a lexicographic offline RL framework designed to preserve safety-aligned behavior. We first develop LexiSafe-SC, a single-cost formulation for standard offline safe RL, and derive safety-violation and performance-suboptimality bounds that together yield sample-complexity guarantees. We then extend the framework to hierarchical safety requirements with LexiSafe-MC, which supports multiple safety costs and admits its own sample-complexity analysis. Empirically, LexiSafe demonstrates reduced safety violations and improved task performance compared to constrained offline baselines. By unifying lexicographic prioritization with structural bias, LexiSafe offers a practical and theoretically grounded approach for safety-critical CPS decision-making.
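To make the idea of lexicographic prioritization concrete, here is a minimal generic sketch of a safety-first, reward-second selection rule. It is not the LexiSafe algorithm, which the abstract does not specify; the `Candidate` type, the `lexicographic_pick` function, the `slack` tolerance, and the numbers are all hypothetical and chosen only for illustration.

```python
from typing import NamedTuple, Sequence


class Candidate(NamedTuple):
    # Hypothetical container for one candidate policy or action.
    name: str
    cost: float    # estimated safety cost (lower is safer)
    reward: float  # estimated task reward (higher is better)


def lexicographic_pick(candidates: Sequence[Candidate], slack: float = 0.0) -> Candidate:
    """Pick by safety first, reward second (illustrative assumption, not the paper's method)."""
    if not candidates:
        raise ValueError("need at least one candidate")
    best_cost = min(c.cost for c in candidates)
    # Stage 1: keep only candidates whose cost is within `slack` of the minimum cost.
    safe_set = [c for c in candidates if c.cost <= best_cost + slack]
    # Stage 2: among the remaining candidates, maximize reward.
    return max(safe_set, key=lambda c: c.reward)


# Made-up numbers for illustration only.
candidates = [
    Candidate("fast", cost=0.9, reward=1.0),
    Candidate("cautious", cost=0.1, reward=0.4),
    Candidate("balanced", cost=0.15, reward=0.7),
]
choice_strict = lexicographic_pick(candidates)
choice_slack = lexicographic_pick(candidates, slack=0.1)
```

With zero slack, only the lowest-cost candidate survives the first stage, so reward never overrides safety. A small slack lets reward break ties among near-equally safe candidates, which differs from a weighted sum, where a large enough reward can always buy back a safety violation.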

Hsin-Jung Yang, Zhanhong Jiang, Prajwal Koirala, Qisai Liu, Cody Fleming, Soumik Sarkar • 2026

Related benchmarks

Task                    Dataset                        Result        Rank
Reinforcement Learning  Safety Gym HopperVel           Reward 0.7    6
Reinforcement Learning  Bullet Safety Gym CarRun       Reward 0.98   6
Reinforcement Learning  Bullet Safety Gym BallCircle   Reward 0.71   6
Reinforcement Learning  Bullet Safety Gym AntCircle    Reward 0.51   6
Reinforcement Learning  Safety Gym HalfCheetahVel      Reward 0.97   6
Reinforcement Learning  Safety Gym Walker2dVel         Reward 0.78   6
Reinforcement Learning  Bullet Safety Gym AntRun       Reward 0.65   6
Reinforcement Learning  Safety Gym SwimmerVel          Reward 0.51   6
Reinforcement Learning  Bullet Safety Gym CarCircle    Reward 0.71   6
Reinforcement Learning  Bullet Safety Gym DroneCircle  Reward 0.51   6
Showing 10 of 12 rows
