| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Offline Reinforcement Learning | D4RL halfcheetah-medium-expert | Normalized Score: 110 | 117 |
| Offline Reinforcement Learning | D4RL hopper-medium-expert | Normalized Score: 119.2 | 115 |
| Offline Reinforcement Learning | D4RL walker2d-medium-expert | Normalized Score: 114.2 | 86 |
| Offline Reinforcement Learning | D4RL walker2d-random | Normalized Score: 510 | 77 |
| Offline Reinforcement Learning | D4RL Medium-Replay Hopper | Normalized Score: 110.6 | 72 |
| Offline Reinforcement Learning | D4RL halfcheetah-random | Normalized Score: 45.4 | 70 |
| Offline Reinforcement Learning | D4RL hopper-random | Normalized Score: 53.6 | 62 |
| Offline Reinforcement Learning | D4RL Medium-Replay HalfCheetah | Normalized Score: 77.6 | 59 |
| Offline Reinforcement Learning | D4RL Medium HalfCheetah | Normalized Score: 84.3 | 59 |
| Offline Reinforcement Learning | D4RL Medium Walker2d | Normalized Score: 106.4 | 58 |
| Offline Reinforcement Learning | D4RL halfcheetah v2 (medium-replay) | Normalized Score: 76.9 | 58 |
| Offline Reinforcement Learning | D4RL halfcheetah-expert v2 | Normalized Score: 106.8 | 56 |
| Offline Reinforcement Learning | D4RL walker2d-expert v2 | Normalized Score: 115.9 | 56 |
| Offline Reinforcement Learning | D4RL hopper-expert v2 | Normalized Score: 113 | 56 |
| hopper locomotion | D4RL hopper medium-replay | Normalized Score: 105.12 | 56 |
| Offline Reinforcement Learning | D4RL Hopper-medium-replay v2 | Normalized Return: 107.4 | 54 |
| walker2d locomotion | D4RL walker2d medium-replay | Normalized Score: 106.2 | 53 |
| Offline Reinforcement Learning | D4RL Gym walker2d (medium-replay) | Normalized Return: 109.7 | 52 |
| Offline Reinforcement Learning | D4RL Hopper-medium-expert v2 | Normalized Return: 111.8 | 49 |
| Locomotion | D4RL walker2d-medium-expert | Normalized Score: 121.4 | 47 |
| Offline Reinforcement Learning | D4RL walker2d-medium-replay | Normalized Score: 99.3 | 45 |
| Locomotion | D4RL walker2d-medium | Normalized Score: 88.1 | 44 |
| Locomotion | D4RL halfcheetah-medium | Normalized Score: 63.5 | 44 |
| Offline Reinforcement Learning | D4RL walker2d-medium-expert v2 | Normalized Score: 115.4 | 44 |
| Offline Reinforcement Learning | D4RL antmaze-umaze (diverse) | Normalized Score: 93.5 | 40 |