Graph-Assisted Stitching for Offline Hierarchical Reinforcement Learning

About

Existing offline hierarchical reinforcement learning methods rely on high-level policy learning to generate subgoal sequences. However, their efficiency degrades as task horizons increase, and they lack effective strategies for stitching useful state transitions across different trajectories. We propose Graph-Assisted Stitching (GAS), a novel framework that formulates subgoal selection as a graph search problem rather than learning an explicit high-level policy. By embedding states into a Temporal Distance Representation (TDR) space, GAS clusters semantically similar states from different trajectories into unified graph nodes, enabling efficient transition stitching. A shortest-path algorithm is then applied to select subgoal sequences within the graph, while a low-level policy learns to reach the subgoals. To improve graph quality, we introduce the Temporal Efficiency (TE) metric, which filters out noisy or inefficient transition states, significantly enhancing task performance. GAS outperforms prior offline HRL methods across locomotion, navigation, and manipulation tasks. Notably, in the most stitching-critical task, it achieves a score of 88.3, dramatically surpassing the previous state-of-the-art score of 1.0. Our source code is available at: https://github.com/qortmdgh4141/GAS.
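The graph-search idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes states are already embedded so that Euclidean distance approximates temporal distance, and it substitutes a simple greedy radius clustering for the paper's node construction. The TE filtering step is omitted. The function names and the `radius` and `max_step` parameters are hypothetical.

```python
import heapq
import math


def build_nodes(embeddings, radius):
    """Greedily merge embeddings within `radius` of an existing node center.

    Assumption: a stand-in for clustering nearby states into graph nodes.
    """
    centers = []
    for z in embeddings:
        if all(math.dist(z, c) > radius for c in centers):
            centers.append(z)
    return centers


def build_edges(centers, max_step):
    """Connect nodes whose embedding distance is at most `max_step`."""
    graph = {i: [] for i in range(len(centers))}
    for i, a in enumerate(centers):
        for j, b in enumerate(centers):
            d = math.dist(a, b)
            if i != j and d <= max_step:
                graph[i].append((j, d))
    return graph


def shortest_path(graph, start, goal):
    """Dijkstra's algorithm; returns a list of node indices, or None."""
    dist = {start: 0.0}
    prev = {}
    heap = [(0.0, start)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == goal:
            path = [u]
            while u in prev:
                u = prev[u]
                path.append(u)
            return path[::-1]
        if d > dist.get(u, math.inf):
            continue  # stale heap entry
        for v, w in graph[u]:
            nd = d + w
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    return None


# Two toy trajectories that meet near (2, 0); neither alone reaches the goal.
traj_a = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
traj_b = [(2.0, 0.1), (3.0, 0.0), (4.0, 0.0)]
centers = build_nodes(traj_a + traj_b, radius=0.5)
graph = build_edges(centers, max_step=1.2)
print(shortest_path(graph, 0, 4))  # [0, 1, 2, 3, 4]
```

In this toy case the states (2.0, 0.0) and (2.0, 0.1) collapse into one node, so the path from the start of the first trajectory to the end of the second passes through it. The node sequence along the path plays the role of the subgoal sequence that a low-level policy would be trained to follow.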

Seungho Baek, Taegeon Park, Jongchan Park, Seungjun Oh, Yusung Kim • 2025

Related benchmarks

Task | Dataset | Result | Rank
Goal-conditioned Reinforcement Learning | OGBench antmaze-medium-stitch v0 | Success Rate 98.1 | 12
Goal-conditioned Reinforcement Learning | OGBench antmaze-large-stitch v0 | Success Rate 96.3 | 12
Goal-conditioned Reinforcement Learning | OGBench antmaze-giant-stitch v0 | Success Rate 86.2 | 12
Goal-conditioned Reinforcement Learning | OGBench humanoidmaze-medium-stitch v0 | Success Rate 96.2 | 12
Goal-conditioned Reinforcement Learning | OGBench humanoidmaze-large-stitch v0 | Success Rate 80.6 | 12
Goal-conditioned Reinforcement Learning | OGBench antmaze-large-explore v0 | Success Rate 91 | 12
Goal-conditioned Reinforcement Learning | OGBench humanoidmaze-giant-stitch v0 | Success Rate 82.4 | 12
Goal-conditioned Reinforcement Learning | OGBench antmaze-large-navigate v0 | Success Rate 93.2 | 11
Goal-conditioned Reinforcement Learning | OGBench antmaze-giant-navigate v0 | Success Rate 76 | 11
Goal-conditioned Reinforcement Learning | OGBench humanoidmaze-medium-navigate v0 | Success Rate 96.3 | 11
Showing 10 of 24 rows
