OGBench: Benchmarking Offline Goal-Conditioned RL
About
Offline goal-conditioned reinforcement learning (GCRL) is a major problem in reinforcement learning (RL) because it provides a simple, unsupervised, and domain-agnostic way to acquire diverse behaviors and representations from unlabeled data without rewards. Despite the importance of this setting, we lack a standard benchmark that can systematically evaluate the capabilities of offline GCRL algorithms. In this work, we propose OGBench, a new, high-quality benchmark for algorithms research in offline goal-conditioned RL. OGBench consists of 8 types of environments, 85 datasets, and reference implementations of 6 representative offline GCRL algorithms. We have designed these challenging and realistic environments and datasets to directly probe different capabilities of algorithms, such as stitching, long-horizon reasoning, and the ability to handle high-dimensional inputs and stochasticity. While representative algorithms may rank similarly on prior benchmarks, our experiments reveal stark strengths and weaknesses in these different capabilities, providing a strong foundation for building new algorithms.
Project page: https://seohong.me/projects/ogbench
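To make "unlabeled data without rewards" concrete: one common way to get a goal-conditioned training signal from reward-free trajectories is hindsight goal relabeling, where states that were actually visited later in a trajectory are treated as goals. The following is a minimal sketch under that assumption, not OGBench's reference implementation. The function name `relabel_with_future_goals`, the toy grid states, and the 0 / -1 sparse reward convention are all illustrative choices.

```python
import random


def relabel_with_future_goals(trajectory, rng):
    """Turn one reward-free trajectory into goal-conditioned training tuples.

    trajectory: list of states (any objects that support ==).
    Returns a list of (state, next_state, goal, reward) tuples, where each goal
    is a state sampled from the same trajectory at or after next_state.
    """
    tuples = []
    last_index = len(trajectory) - 1
    for t in range(last_index):
        state, next_state = trajectory[t], trajectory[t + 1]
        # Sample a goal from the future part of the same trajectory
        # (randint is inclusive on both ends).
        goal = trajectory[rng.randint(t + 1, last_index)]
        # Assumed sparse reward convention: 0 once the goal is reached, -1 otherwise.
        reward = 0.0 if next_state == goal else -1.0
        tuples.append((state, next_state, goal, reward))
    return tuples


# Toy trajectory of 2D grid positions; no rewards are stored in the data.
rng = random.Random(0)
trajectory = [(0, 0), (0, 1), (1, 1), (1, 2)]
batch = relabel_with_future_goals(trajectory, rng)
```

Each tuple in `batch` now carries a goal and a reward even though the original trajectory had neither, which is why this setting needs no reward labels in the dataset.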
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| antmaze-medium-navigate | OGBench 100% offline dataset | Success Rate | 95 | 12 |
| cube-single-play | OGBench 100% offline dataset | Success Rate | 0.68 | 12 |
| scene-play | OGBench 100% offline dataset | Success Rate | 51 | 12 |
| antsoccer-medium-navigate | OGBench 100% offline dataset | Success Rate | 7 | 12 |
| antsoccer-arena-navigate | OGBench 100% offline | Success Rate | 50 | 12 |
| Robotic Planning | OGBench Scene 48 (play) | Success Rate | 0.42 | 8 |
| Robotic Planning | OGBench PointMaze Giant 48 (stitch) | Success Rate | 0.00 | 8 |
| Robotic Planning | OGBench AntMaze Giant 48 (stitch) | Success Rate | 0.00 | 8 |
| Goal-conditioned manipulation | OGBench cube-single-play v0 | Task 1 Success Rate | 5.7 | 7 |