| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| Locomotion | D4RL Ant medium-offline | Normalized Score | 85.28 | 36 |
| Offline Imitation Learning | D4RL Ant v2 (expert) | Normalized Score | 126.4 | 20 |
| Offline Reinforcement Learning | D4RL ant medium v3 | Normalized Score | 98.9 | 7 |
| Reinforcement Learning | D4RL Ant Medium | D4RL Score | 94.25 | 7 |
| Locomotion | D4RL Ant Medium-Expert | Mean Return | 99.1 | 5 |
| Locomotion | D4RL Ant Medium | Mean Return | 82.5 | 5 |
| Locomotion | D4RL Ant Medium-Replay | Mean Return | 78.9 | 5 |
| Reinforcement Learning | D4RL Ant Medium-Expert | Mean Normalized Return | 99.1 | 5 |
| Reinforcement Learning | D4RL Ant Medium-Replay | Mean Normalized Return | 81.8 | 5 |
| Reinforcement Learning | D4RL Ant (Random) | Mean Normalized Return | 83.4 | 5 |
| Offline Reinforcement Learning | D4RL Ant Medium-Replay v2 | Normalized Score | 92.7 | 4 |
| Offline Reinforcement Learning | D4RL Ant Medium-Expert v2 | Normalized Score | 136.2 | 4 |
| Reinforcement Learning | D4RL Ant Med-Expert | D4RL Score | 125.47 | 2 |
| Reinforcement Learning | D4RL Ant Med-Replay | D4RL Score | 89.39 | 2 |