
Optimal Goal-Reaching Reinforcement Learning via Quasimetric Learning

About

In goal-reaching reinforcement learning (RL), the optimal value function has a particular geometry, called quasimetric structure. This paper introduces Quasimetric Reinforcement Learning (QRL), a new RL method that utilizes quasimetric models to learn optimal value functions. Distinct from prior approaches, the QRL objective is specifically designed for quasimetrics, and provides strong theoretical recovery guarantees. Empirically, we conduct thorough analyses on a discretized MountainCar environment, identifying properties of QRL and its advantages over alternatives. On offline and online goal-reaching benchmarks, QRL also demonstrates improved sample efficiency and performance, across both state-based and image-based observations.
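The quasimetric structure mentioned above can be illustrated on a toy problem. The sketch below is not the QRL method or its objective. It assumes a hypothetical three-state deterministic environment with a cost of 1 per step, so that the optimal value is the negated shortest-path cost. The names `EDGES`, `shortest_path_costs`, `is_quasimetric`, and `is_symmetric` are illustrative only. It computes the optimal cost of reaching every goal from every state and checks that these costs satisfy the triangle inequality without being symmetric.

```python
import itertools

INF = float("inf")


def shortest_path_costs(n, edges):
    """All-pairs optimal cost-to-reach on a directed graph (Floyd-Warshall)."""
    d = [[0.0 if i == j else INF for j in range(n)] for i in range(n)]
    for u, v, cost in edges:
        d[u][v] = min(d[u][v], cost)
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    return d


def is_quasimetric(d):
    """Check d(x, x) = 0, d >= 0, and the triangle inequality."""
    n = len(d)
    identity = all(d[i][i] == 0 for i in range(n))
    nonneg = all(d[i][j] >= 0 for i in range(n) for j in range(n))
    triangle = all(
        d[i][j] <= d[i][k] + d[k][j]
        for i, j, k in itertools.product(range(n), repeat=3)
    )
    return identity and nonneg and triangle


def is_symmetric(d):
    """A metric would also need d(x, y) = d(y, x); a quasimetric does not."""
    n = len(d)
    return all(d[i][j] == d[j][i] for i in range(n) for j in range(n))


# Hypothetical toy environment: a one-way cycle 0 -> 1 -> 2 -> 0, cost 1 per step.
EDGES = [(0, 1, 1.0), (1, 2, 1.0), (2, 0, 1.0)]
D = shortest_path_costs(3, EDGES)

print("cost 0 -> 1:", D[0][1])
print("cost 1 -> 0:", D[1][0])
print("quasimetric:", is_quasimetric(D))
print("symmetric:", is_symmetric(D))
```

In this toy case, going from state 0 to state 1 takes one step, while returning takes two. This kind of asymmetry is what a symmetric metric cannot represent and a quasimetric can.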

Tongzhou Wang, Antonio Torralba, Phillip Isola, Amy Zhang • 2023

Related benchmarks

| Task | Dataset | Result | Rank |
| Offline Reinforcement Learning | puzzle-4x4-play OGBench 5 tasks v0 | Average Success Rate: 0.00e+0 | 28 |
| Goal-conditioned manipulation | OGBench puzzle-4x4-play | Score: 0.00e+0 | 24 |
| Goal-conditioned Reinforcement Learning | antmaze stitch medium | Success Rate: 59 | 23 |
| Goal-conditioned Reinforcement Learning | antmaze stitch large | Success Rate: 24 | 23 |
| Manipulation | OGBench cube-triple-play | Success Rate: 0.00e+0 | 19 |
| Goal-oriented planning | OGBench PointMaze Large v1 (stitch) | Success Rate: 90 | 14 |
| Offline Goal-Conditioned Reinforcement Learning | antmaze medium-navigate v0 | Success Rate: 88 | 14 |
| Goal-conditioned locomotion | OGBench PointMaze-Stitch Giant | Success Rate: 50 | 14 |
| Goal-conditioned Reinforcement Learning | manipulation scene-play | Success Rate: 10 | 14 |
| Offline Goal-Conditioned Reinforcement Learning | humanoidmaze large-navigate v0 | Success Rate: 5 | 14 |
Showing 10 of 216 rows