| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Offline Reinforcement Learning | MuJoCo Hopper Friction shift | Normalized Score: 52.94 | 32 |
| Reinforcement Learning | MuJoCo Hopper target planet (Venus-like gravity) | Reward: 2,670 | 25 |
| Reinforcement Learning | MuJoCo Hopper v2 | Average Return: 4,408 | 18 |
| Cross-Domain Offline Reinforcement Learning | MuJoCo Hopper mr/me to m/me/e v2 | Normalized Score: 76.5 | 12 |
| Continuous Control | MuJoCo Hopper logarithmic adversary v1 | Average Performance Score: 2,577 | 12 |
| Continuous Control | MuJoCo Hopper H=20 | Normalized Return: 33.3 | 10 |
| Continuous Control | MuJoCo Hopper H=10 | Normalized Return: 13.2 | 10 |
| Continuous control locomotion | MuJoCo Hopper v3 (train) | Best Episodic Return: 4,170.5 | 10 |
| Locomotion | MuJoCo Hopper Friction shift | Normalized Return: 8.2 | 8 |
| Locomotion | MuJoCo Hopper Kinematic shift | Normalized Return: 66.2 | 8 |
| Locomotion | MuJoCo Hopper Morphology shift | Normalized Return: 63.5 | 8 |
| Offline Reinforcement Learning | MuJoCo Hopper Medium-Replay v2 | Avg Normalized Score: 100.02 | 8 |
| Offline Reinforcement Learning | MuJoCo Hopper Medium-Expert v2 | Avg Normalized Score: 107 | 7 |
| Offline Reinforcement Learning | MuJoCo Hopper Medium v2 | Averaged Normalized Score: 102 | 7 |
| Continuous Control | MuJoCo Hopper v4 (test) | Mean Episodic Return: 3,338 | 6 |
| Continuous Control | MuJoCo Hopper 2-p v4 | Normalized Return: 106 | 6 |
| Continuous Control | MuJoCo Hopper 4-p v4 | Normalized Return: 99 | 6 |
| Continuous Control | MuJoCo Hopper v2 (train) | Mean Return: 3,713 | 6 |
| Multi-objective Reinforcement Learning | MuJoCo Hopper 2 | Hypervolume (HV): 22.09 | 5 |
| Reinforcement Learning | MuJoCo Hopper epsilon=0.075 (test) | Natural Return: 3,684 | 5 |
| Reinforcement Learning | MuJoCo Hopper v4 | Policy Return: 549 | 5 |
| Offline Inverse Reinforcement Learning | MuJoCo hopper (medium-exp) | Average Reward: 3,512.09 | 5 |
| Offline Inverse Reinforcement Learning | MuJoCo hopper (medium-replay) | Average Reward: 3,512.09 | 5 |
| Offline Inverse Reinforcement Learning | MuJoCo hopper medium | Average Reward: 3,512.09 | 5 |
| Continuous Control | MuJoCo Hopper v3 (1M steps) | Average Return: 3,687 | 5 |