Optimal Goal-Reaching Reinforcement Learning via Quasimetric Learning

About

In goal-reaching reinforcement learning (RL), the optimal value function has a particular geometry, called quasimetric structure. This paper introduces Quasimetric Reinforcement Learning (QRL), a new RL method that utilizes quasimetric models to learn optimal value functions. Distinct from prior approaches, the QRL objective is specifically designed for quasimetrics, and provides strong theoretical recovery guarantees. Empirically, we conduct thorough analyses on a discretized MountainCar environment, identifying properties of QRL and its advantages over alternatives. On offline and online goal-reaching benchmarks, QRL also demonstrates improved sample efficiency and performance, across both state-based and image-based observations.
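The quasimetric structure mentioned above can be checked on a toy example. The sketch below is not the QRL objective or any code from the paper. It only illustrates, on a hypothetical four-state directed environment with an assumed cost of 1 per step, that optimal cost-to-go (the negated optimal value) is zero from a state to itself, obeys the triangle inequality, and need not be symmetric. The names `edges`, `shortest_path_costs`, `is_quasimetric`, and `is_symmetric` are made up for illustration.

```python
import itertools
import math

# Hypothetical deterministic environment: edges[s][t] is the cost of stepping
# from state s to state t. The graph is a one-way cycle 0 -> 1 -> 2 -> 3 -> 0.
edges = {
    0: {1: 1.0},
    1: {2: 1.0},
    2: {3: 1.0},
    3: {0: 1.0},
}


def shortest_path_costs(edges):
    """All-pairs optimal cost-to-go d(s, g), computed with Floyd-Warshall."""
    states = sorted(edges)
    d = {
        (s, g): 0.0 if s == g else edges[s].get(g, math.inf)
        for s in states
        for g in states
    }
    # product() varies its first element slowest, so k is the outermost loop,
    # which is what Floyd-Warshall requires.
    for k, s, g in itertools.product(states, repeat=3):
        d[s, g] = min(d[s, g], d[s, k] + d[k, g])
    return states, d


def is_quasimetric(states, d, tol=1e-9):
    """Check d(s, s) = 0, non-negativity, and the triangle inequality."""
    if any(d[s, s] != 0.0 for s in states):
        return False
    if any(value < 0.0 for value in d.values()):
        return False
    return all(
        d[s, g] <= d[s, k] + d[k, g] + tol
        for s, k, g in itertools.product(states, repeat=3)
    )


def is_symmetric(states, d):
    """A metric would also need d(s, g) == d(g, s); a quasimetric does not."""
    return all(d[s, g] == d[g, s] for s in states for g in states)


states, d = shortest_path_costs(edges)
# Under the assumed cost of 1 per step, the optimal value is V*(s, g) = -d(s, g).
optimal_value = {pair: -cost for pair, cost in d.items()}

print("d(0, 1) =", d[0, 1], " d(1, 0) =", d[1, 0])
print("quasimetric:", is_quasimetric(states, d))
print("symmetric:", is_symmetric(states, d))
```

Reaching state 1 from state 0 takes one step, while the return trip must go around the cycle, so the two directions have different costs. This asymmetry is why a quasimetric model, rather than an ordinary symmetric metric, is the natural fit for goal-reaching value functions.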

Tongzhou Wang, Antonio Torralba, Phillip Isola, Amy Zhang • 2023

Related benchmarks

Task | Dataset | Result | Rank
Offline Reinforcement Learning | puzzle-4x4-play OGBench 5 tasks v0 | Average Success Rate: 0.00e+0 | 18
Goal-conditioned Reinforcement Learning | pointmaze navigate medium | Success Rate: 83 | 11
Goal-conditioned Reinforcement Learning | manipulation-cube-single-play (test) | Success Rate: 0.11 | 11
task5 | humanoidmaze giant | Success Rate: 800 | 10
task5 | puzzle 4x6 | Success Rate: 0.00e+0 | 10
Overall | humanoidmaze giant | Success Rate: 3 | 10
task1 | humanoidmaze giant | Success Rate: 1 | 10
task3 | puzzle 4x6 | Success Rate: 0.00e+0 | 10
Overall | puzzle 4x5 | Success Rate: 0.00e+0 | 10
Overall | puzzle 4x6 | Success Rate: 0.00e+0 | 10
Showing 10 of 67 rows
