OGBench: Benchmarking Offline Goal-Conditioned RL

About

Offline goal-conditioned reinforcement learning (GCRL) is a major problem in reinforcement learning (RL) because it provides a simple, unsupervised, and domain-agnostic way to acquire diverse behaviors and representations from unlabeled data without rewards. Despite the importance of this setting, we lack a standard benchmark that can systematically evaluate the capabilities of offline GCRL algorithms. In this work, we propose OGBench, a new, high-quality benchmark for algorithms research in offline goal-conditioned RL. OGBench consists of 8 types of environments, 85 datasets, and reference implementations of 6 representative offline GCRL algorithms. We have designed these challenging and realistic environments and datasets to directly probe different capabilities of algorithms, such as stitching, long-horizon reasoning, and the ability to handle high-dimensional inputs and stochasticity. While representative algorithms may rank similarly on prior benchmarks, our experiments reveal stark strengths and weaknesses in these different capabilities, providing a strong foundation for building new algorithms. Project page: https://seohong.me/projects/ogbench

Seohong Park, Kevin Frans, Benjamin Eysenbach, Sergey Levine • 2024

Related benchmarks

Task                           Dataset                              Metric               Result   Rank
antmaze-medium-navigate        OGBench 100% offline dataset         Success Rate         95       12
cube-single-play               OGBench 100% offline dataset         Success Rate         0.68     12
scene-play                     OGBench 100% offline dataset         Success Rate         51       12
antsoccer-medium-navigate      OGBench 100% offline dataset         Success Rate         7        12
antsoccer-arena-navigate       OGBench 100% offline                 Success Rate         50       12
Robotic Planning               OGBench Scene 48 (play)              Success Rate         0.42     8
Robotic Planning               OGBench PointMaze Giant 48 (stitch)  Success Rate         0.00e+0  8
Robotic Planning               OGBench AntMaze Giant 48 (stitch)    Success Rate         0.00e+0  8
Goal-conditioned manipulation  OGBench cube-single-play v0          Task 1 Success Rate  5.7      7