| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| Continuous Control | MuJoCo Walker2d v4 | Normalized Performance | 13,060 | 24 |
| Offline Reinforcement Learning | MuJoCo walker2d medium-replay D4RL | Normalized Return | 94.1 | 20 |
| Offline Reinforcement Learning | MuJoCo walker2d medium-expert D4RL | Normalized Return | 116.6 | 18 |
| Reinforcement Learning | MuJoCo Walker2d v2 | Average Return | 8,004 | 18 |
| Continuous Control | MuJoCo Walker2d (H=10) | Normalized Return | 14.9 | 10 |
| Reinforcement Learning | MuJoCo Walker2d 1.5 density v1 (test) | Reward | 2,674 | 7 |
| Continuous Control | MuJoCo Walker2d 10-p v4 | Normalized Return | 102 | 6 |
| Continuous Control | MuJoCo Walker2d 4-p v4 | Normalized Return | 94.2 | 6 |
| Continuous Control | MuJoCo Walker2d v2 (train) | Mean Return | 5,278 | 6 |
| Reinforcement Learning | Sparse MuJoCo Walker2d v2 (test) | Max Return | 886.6 | 6 |
| Reinforcement Learning | MuJoCo Walker2d epsilon=0.05 (test) | Natural Return | 4,875 | 5 |
| Offline Inverse Reinforcement Learning | MuJoCo walker2d medium-exp | Average Reward | 5,383.98 | 5 |
| Offline Inverse Reinforcement Learning | MuJoCo walker2d (medium-replay) | Avg Reward | 5,383.98 | 5 |
| Offline Inverse Reinforcement Learning | MuJoCo walker2d medium | Avg Reward | 5,383.98 | 5 |
| Continuous Control | MuJoCo Walker2d 1M steps v3 | Average Return | 5,099 | 5 |
| Continuous Control | MuJoCo Walker2d v3 (500K steps) | Average Return | 4,034 | 5 |
| Policy Optimization | MuJoCo Walker2d H=40 | Return | 221.1 | 5 |
| Policy Optimization | MuJoCo Walker2d H=20 | Return | 60.7 | 5 |
| Continuous Control | MuJoCo Walker2d (H=40) | Normalized Return | 221.1 | 5 |
| Continuous Control | MuJoCo Walker2d (H=20) | Normalized Return | 60.7 | 5 |
| Continuous Control | MuJoCo Walker2d v5 (test) | Average Return | 4,417 | 4 |
| Off-dynamics Reinforcement Learning | MuJoCo Walker2d 0.5 density dynamics shift (test) | Reward | 2,729 | 4 |
| Dynamics Model Prediction | MuJoCo Walker2d medium-replay v2 (test) | RMSE | 0.968 | 4 |
| Dynamics Model Prediction | MuJoCo Walker2d expert v2 (test) | RMSE | 1.514 | 4 |
| Dynamics Model Prediction | MuJoCo Walker2d medium v2 (train) | RMSE | 0.438 | 4 |