Reinforcement Learning for Risk Adaptation via Differentiable CVaR Barrier Functions
About
Planning through crowded environments under uncertain obstacle motions remains difficult, as stochastic interactions often induce overly conservative behavior or reduced efficiency. To address this challenge, we propose an end-to-end risk adaptation framework for crowd navigation under obstacle-motion uncertainty modeled by a Gaussian mixture model. The framework combines reinforcement learning (RL) with a differentiable quadratic-program safety layer based on Conditional Value-at-Risk (CVaR) barrier functions. It jointly learns the nominal control input, the risk level, and the safety margin while enforcing explicit probabilistic safety constraints. This design enables context-aware adaptation, promoting efficient behavior while invoking caution only when necessary. We conduct extensive evaluations in dynamic, uncertain, and crowded environments across varying obstacle densities and robot models, and further assess generalization under three out-of-distribution cases. We compare against optimization-based, RL-based, and integrated RL-and-optimization methods, and the proposed method delivers the strongest overall performance in safety, efficiency, and generalization under uncertainty.
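To make the CVaR quantity in the abstract concrete, the following is a minimal illustrative sketch, not the paper's implementation. It draws scalar samples from a one-dimensional Gaussian mixture and estimates CVaR empirically as the mean of the worst-case tail. The function names `sample_gmm` and `empirical_cvar`, the one-dimensional setting, and the convention that a larger value means a larger loss are all assumptions made for illustration.

```python
import numpy as np


def sample_gmm(weights, means, stds, n, rng):
    """Draw n scalar samples from a 1-D Gaussian mixture (hypothetical helper)."""
    # Pick a mixture component for each sample, then sample from that component.
    comp = rng.choice(len(weights), size=n, p=weights)
    return rng.normal(np.asarray(means)[comp], np.asarray(stds)[comp])


def empirical_cvar(losses, alpha):
    """Sample estimate of CVaR at level alpha.

    Assumed convention: larger values are worse. The estimate is the mean of
    all samples at or above the empirical alpha-quantile (the VaR).
    """
    losses = np.asarray(losses, dtype=float)
    var = np.quantile(losses, alpha)
    return losses[losses >= var].mean()


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Two-mode uncertainty, e.g. an obstacle that may drift in either direction.
    samples = sample_gmm([0.7, 0.3], [0.0, 1.5], [0.2, 0.4], 10_000, rng)
    for alpha in (0.5, 0.9, 0.99):
        print(alpha, empirical_cvar(samples, alpha))
```

Raising `alpha` restricts the average to a smaller, more extreme tail, so the estimate never decreases. This is the sense in which a learned risk level can trade caution against efficiency: a constraint on CVaR at a higher level is a stricter requirement.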
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Robot navigation | 20-obstacle environment, Single Integrator | Success Rate (SR) | 72.4 | 12 |
| Robot navigation | 20-obstacle environment, Unicycle | Success Rate (SR) | 66.2 | 12 |
| Safe Robot Navigation | OOD Case I: ORCA-based obstacle policy | Success Rate (SR) | 65.2 | 4 |
| Safe Robot Navigation | OOD Case II: high obstacle density (30 obstacles) | Success Rate (SR) | 44.2 | 4 |
| Safe Robot Navigation | OOD Case III: increased obstacle radius (0.5 m) | Success Rate (SR) | 61.6 | 4 |