| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| Continuous Control | MuJoCo HalfCheetah v4 | Normalized Performance | 107 | 18 |
| Reinforcement Learning | MuJoCo HalfCheetah v2 | Average Return | 18,165 | 18 |
| Continuous Control | MuJoCo HalfCheetah fixed random adversary L=0.1 | Average Return | 8,805 | 12 |
| Continuous Control | MuJoCo HalfCheetah logarithmic adversary v1 | Avg Performance | 6,625 | 12 |
| Continuous Control | MuJoCo HalfCheetah Vel (test) | Mean Return | -48.4 | 9 |
| Reinforcement Learning | MuJoCo HalfCheetah 1.5 density v1 (test) | Reward | 11,515 | 7 |
| Continuous Control | MuJoCo HalfCheetah 10D-task (c) | Mean Return | 1,617 | 7 |
| Continuous Control | MuJoCo HalfCheetah 10D-task (b) | Mean Return | 1,984 | 7 |
| Continuous Control | MuJoCo HalfCheetah 10D-task (a) | Mean Return | 1,893 | 7 |
| Continuous Control | MuJoCo HalfCheetah Body (test) | Mean Return | 1,655 | 7 |
| Continuous Control | MuJoCo HalfCheetah Mass (test) | Mean Return | 1,726 | 7 |
| Meta-Reinforcement Learning | MuJoCo HalfCheetah 10D-task (c) (test) | CVaR 0.05 Return | 1,024 | 7 |
| Meta-Reinforcement Learning | MuJoCo HalfCheetah 10D-task (b) (test) | CVaR 0.05 Return | 1,697 | 7 |
| Meta-Reinforcement Learning | MuJoCo HalfCheetah 10D-task (a) (test) | CVaR 0.05 Return | 1,227 | 7 |
| Meta-Reinforcement Learning | MuJoCo HalfCheetah Body variation (test) | CVaR 0.05 Return | 935 | 7 |
| Meta-Reinforcement Learning | MuJoCo HalfCheetah Mass variation (test) | CVaR 0.05 Return | 1,259 | 7 |
| Meta-Reinforcement Learning | MuJoCo HalfCheetah Velocity variation (test) | CVaR 0.05 Return | -184 | 7 |
| Continuous Control | MuJoCo HalfCheetah 10-p v4 | Normalized Return | 80.5 | 6 |
| Continuous Control | MuJoCo HalfCheetah 2-p v4 | Normalized Return | 44.9 | 6 |
| Continuous Control | MuJoCo HalfCheetah 4-p v4 | Normalized Return | 54.7 | 6 |
| Reinforcement Learning | Delayed MuJoCo HalfCheetah v2 (test) | Average Return | 8,451.2 | 6 |
| Reinforcement Learning | MuJoCo HalfCheetah Sparse v2 (test) | Max Average Return | 924.9 | 6 |
| Reinforcement Learning | MuJoCo Halfcheetah epsilon=0.15 (test) | Natural Return | 5,048 | 5 |
| Offline Inverse Reinforcement Learning | MuJoCo halfcheetah (medium-exp) | Average Reward | 12,174.61 | 5 |
| Continuous Control | MuJoCo HalfCheetah v3 (1M steps) | Average Return | 11,914 | 5 |