| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| Offline Reinforcement Learning | D4RL HalfCheetah Medium v2 | Average Normalized Return | 73.1 | 43 |
| Locomotion | D4RL HalfCheetah medium-offline | Normalized Score | 34.97 | 36 |
| Offline Reinforcement Learning | D4RL HalfCheetah Med-Replay v2 | Avg Normalized Return | 52.2 | 29 |
| Offline Reinforcement Learning | D4RL HalfCheetah Medium | Reward | 54.1 | 28 |
| Offline Reinforcement Learning | D4RL HalfCheetah Med-Expert v2 | Avg Normalized Return | 105.9 | 15 |
| Offline Reinforcement Learning | D4RL halfcheetah medium v2 (test) | Normalized Reward | 48.3 | 8 |
| Offline Reinforcement Learning | D4RL Halfcheetah Simultaneous (adversarial corruption) | Average Score | 19.72 | 8 |
| Offline Inverse Reinforcement Learning | D4RL HalfCheetah Medium v2 | Cumulative Reward | 9,313.29 | 8 |
| Offline Policy Evaluation | D4RL HalfCheetah medium | RMSE | 100.5 | 7 |
| Offline Policy Evaluation | D4RL HalfCheetah medium-replay | RMSE | 46 | 7 |
| Offline Reinforcement Learning | D4RL Halfcheetah random v0 | Return | 4,114 | 6 |
| Continuous Control | D4RL Halfcheetah medium | Normalized Return | 40.7 | 5 |
| Reinforcement Learning | D4RL HalfCheetah no thighs (medium) | Mean Return | 3,910 | 4 |
| Reinforcement Learning | D4RL HalfCheetah broken back thigh medium | Mean Return | 5,761 | 4 |
| Off-policy Evaluation | D4RL HalfCheetah medium-expert | RMAE | 7.8 | 3 |
| Off-policy Evaluation | D4RL HalfCheetah medium | RMAE | 0.247 | 3 |
| Reinforcement Learning | D4RL HalfCheetah Med-Expert | D4RL Score | 75.98 | 2 |
| Reinforcement Learning | D4RL HalfCheetah Medium | D4RL Score | 42.16 | 2 |