Solving Minimum-Cost Reach Avoid using Reinforcement Learning

About

Current reinforcement-learning methods cannot directly learn policies that solve the minimum-cost reach-avoid problem, which minimizes cumulative cost subject to the constraints of reaching the goal and avoiding unsafe states, because the structure of this new optimization problem is incompatible with those methods. Instead, a surrogate problem is solved in which all objectives are combined with a weighted sum. However, this surrogate objective yields suboptimal policies that do not directly minimize the cumulative cost. In this work, we propose RC-PPO, a reinforcement-learning-based method that solves the minimum-cost reach-avoid problem by using connections to Hamilton-Jacobi reachability. Empirical results demonstrate that RC-PPO learns policies with goal-reaching rates comparable to existing methods while achieving up to 57% lower cumulative costs on a suite of minimum-cost reach-avoid benchmarks in the MuJoCo simulator. The project page can be found at https://oswinso.xyz/rcppo.

Oswin So, Cheng Ge, Chuchu Fan • 2024
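
To make the distinction in the abstract concrete, here is a minimal toy sketch in Python. It contrasts the constrained objective (lowest cumulative cost among trajectories that reach the goal and avoid unsafe states) with a weighted-sum surrogate. It is not RC-PPO and not taken from the paper: the `Trajectory` class, the weights `w_goal` and `w_unsafe`, and the three candidate trajectories are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Trajectory:
    """Hypothetical summary of one rollout (illustrative only)."""
    name: str
    costs: List[float]   # per-step costs accumulated along the rollout
    reached_goal: bool   # True if the rollout entered the goal set
    hit_unsafe: bool     # True if the rollout ever entered an unsafe state


def cumulative_cost(traj: Trajectory) -> float:
    return sum(traj.costs)


def is_feasible(traj: Trajectory) -> bool:
    # Reach-avoid constraints: reach the goal AND never touch an unsafe state.
    return traj.reached_goal and not traj.hit_unsafe


def best_constrained(trajs: List[Trajectory]) -> Optional[Trajectory]:
    # Minimum-cost reach-avoid: minimize cost over feasible rollouts only.
    feasible = [t for t in trajs if is_feasible(t)]
    return min(feasible, key=cumulative_cost) if feasible else None


def weighted_sum_objective(traj: Trajectory, w_goal: float, w_unsafe: float) -> float:
    # Assumed surrogate (lower is better): cost, minus a goal bonus,
    # plus an unsafe-state penalty, each scaled by a hand-picked weight.
    return cumulative_cost(traj) - w_goal * traj.reached_goal + w_unsafe * traj.hit_unsafe


def best_surrogate(trajs: List[Trajectory], w_goal: float, w_unsafe: float) -> Trajectory:
    return min(trajs, key=lambda t: weighted_sum_objective(t, w_goal, w_unsafe))


# Three made-up candidate rollouts.
candidates = [
    Trajectory("A", [1.0, 1.0, 1.0], reached_goal=True, hit_unsafe=False),
    Trajectory("B", [2.0, 2.0, 2.0, 2.0], reached_goal=True, hit_unsafe=False),
    Trajectory("C", [0.5], reached_goal=False, hit_unsafe=False),
]

constrained_choice = best_constrained(candidates)             # picks "A"
small_weight_choice = best_surrogate(candidates, 1.0, 1.0)    # picks "C"
large_weight_choice = best_surrogate(candidates, 10.0, 1.0)   # picks "A"
```

In this toy setting the constrained choice is always the cheapest feasible rollout, while the surrogate's choice depends on the weights: with a small goal weight it prefers the cheap rollout that never reaches the goal. This only illustrates the weight sensitivity of a weighted sum; it makes no claim about how the baselines in the paper were tuned.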

Related benchmarks

Task               Dataset            Result               Rank
Sudoku Solving     Sudoku 2x2         Final Reward: 0.3    14
Graph Coloring     Graph Coloring G1  Final Reward: -1.4   7
Graph Coloring     Graph Coloring G2  Final Reward: -3.1   7
Graph Coloring     Graph Coloring G3  Final Reward: -3.1   7
Graph Coloring     Graph Coloring G4  Final Reward: -2.8   7
N-Queens Problem   N-Queens N=4       Final Reward: -0.3   7
N-Queens Problem   N-Queens N=6       Final Reward: -0.8   7
N-Queens Problem   N-Queens N=8       Final Reward: -1.3   7
N-Queens Problem   N-Queens N=10      Final Reward: -1.7   7
Sudoku Solving     Sudoku 3x3         Final Reward: -230   7
Showing 10 of 15 rows
