Goal Reaching with Eikonal-Constrained Hierarchical Quasimetric Reinforcement Learning

About

Goal-Conditioned Reinforcement Learning (GCRL) mitigates the difficulty of reward design by framing tasks as goal reaching rather than maximizing hand-crafted reward signals. In this setting, the optimal goal-conditioned value function naturally forms a quasimetric, motivating Quasimetric RL (QRL), which constrains value learning to quasimetric mappings and enforces local consistency through discrete, trajectory-based constraints. We propose Eikonal-Constrained Quasimetric RL (Eik-QRL), a continuous-time reformulation of QRL based on the Eikonal Partial Differential Equation (PDE). This PDE-based structure makes Eik-QRL trajectory-free, requiring only sampled states and goals, while improving out-of-distribution generalization. We provide theoretical guarantees for Eik-QRL and identify limitations that arise under complex dynamics. To address these challenges, we introduce Eik-Hierarchical QRL (Eik-HiQRL), which integrates Eik-QRL into a hierarchical decomposition. Empirically, Eik-HiQRL achieves state-of-the-art performance in offline goal-conditioned navigation and yields consistent gains over QRL in manipulation tasks, matching temporal-difference methods.
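To make the "trajectory-free" idea concrete, here is a minimal sketch of an Eikonal-style residual evaluated only on sampled state-goal pairs. It assumes the textbook form of the Eikonal equation, ‖∇_s d(s, g)‖ · c(s) = 1 with a constant speed c, which may differ from the exact constraint used in the paper. The names `euclidean_distance`, `grad_wrt_state`, `eikonal_residual`, and `eikonal_loss` are hypothetical, and the Euclidean distance is a hand-picked stand-in for a learned quasimetric value function.

```python
import math
import random

def euclidean_distance(s, g):
    """Hypothetical stand-in for a learned value d(s, g); not the paper's model."""
    return math.sqrt(sum((si - gi) ** 2 for si, gi in zip(s, g)))

def grad_wrt_state(d, s, g, eps=1e-5):
    """Central finite-difference gradient of d(s, g) with respect to the state s."""
    grad = []
    for i in range(len(s)):
        s_plus = list(s)
        s_minus = list(s)
        s_plus[i] += eps
        s_minus[i] -= eps
        grad.append((d(s_plus, g) - d(s_minus, g)) / (2 * eps))
    return grad

def eikonal_residual(d, s, g, speed=1.0):
    """Squared violation of ||grad_s d(s, g)|| * speed = 1 at one (s, g) pair.

    Assumption: constant speed; the paper's constraint may be stated differently.
    """
    norm = math.sqrt(sum(x * x for x in grad_wrt_state(d, s, g)))
    return (norm * speed - 1.0) ** 2

def eikonal_loss(d, pairs, speed=1.0):
    """Mean residual over sampled (state, goal) pairs; no trajectories are needed."""
    return sum(eikonal_residual(d, s, g, speed) for s, g in pairs) / len(pairs)

# Sample states and goals independently; goals are kept away from states so
# that the distance is differentiable at every sampled state.
rng = random.Random(0)
pairs = [
    ([rng.uniform(-1, 1), rng.uniform(-1, 1)],
     [rng.uniform(2, 3), rng.uniform(2, 3)])
    for _ in range(32)
]

loss = eikonal_loss(euclidean_distance, pairs)
scaled_loss = eikonal_loss(lambda s, g: 2.0 * euclidean_distance(s, g), pairs)
```

In this toy setting the Euclidean distance satisfies the constraint almost exactly, so `loss` is close to zero, while doubling the distance gives a gradient norm of 2 and a residual close to 1. A learned model would typically replace the finite differences with automatic differentiation and combine this penalty with a goal-reaching objective.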

Vittorio Giammarino, Ahmed H. Qureshi • 2025

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Goal-conditioned Reinforcement Learning | antmaze stitch large | Success Rate: 88 | 23 |
| Goal-conditioned Reinforcement Learning | antmaze stitch medium | Success Rate: 0.94 | 23 |
| Goal-conditioned Reinforcement Learning | humanoidmaze stitch large | Success Rate: 63 | 14 |
| Goal-conditioned Reinforcement Learning | antsoccer stitch arena | Success Rate: 32 | 14 |
| Goal-conditioned Reinforcement Learning | humanoidmaze stitch medium | Success Rate: 85 | 14 |
| Goal-conditioned Reinforcement Learning | manipulation scene-play | Success Rate: 0.55 | 14 |
| Goal-conditioned Reinforcement Learning | pointmaze navigate medium | Success Rate: 99 | 11 |
| Goal-conditioned Reinforcement Learning | manipulation cube-single-play | Success Rate: 12 | 11 |
| Goal-conditioned Reinforcement Learning | manipulation-cube-single-play (test) | Success Rate: 0.12 | 11 |
| Goal-conditioned Reinforcement Learning | antsoccer-navigate-arena (test) | Success Rate: 61 | 5 |
Showing 10 of 37 rows
