| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| Locomotion | Half Cheetah IID (test) | Mean Episode Reward | 2,278 | 24 |
| Locomotion Control | Half Cheetah sigma 0.3 (test) | Episode Reward | 1,577 | 24 |
| Locomotion Control | Half Cheetah sigma 0.7 (test) | Reward | 462 | 18 |
| Locomotion Control | Half Cheetah sigma 0.5 (test) | Episode Reward | 1,117 | 18 |
| Locomotion Control | Half Cheetah sigma 0.1 (test) | Episode Reward | 2,278 | 18 |
| Offline Meta-Reinforcement Learning | Half-Cheetah-Vel sampled 10 unseen (test) | Average Return | -48.4 | 10 |
| Reinforcement Learning | Half-cheetah-velocity (train) | Runtime (hours) | 2 | 7 |
| Robotic Control | Half Cheetah | AP | -3.58 | 6 |
| Locomotion | Half-Cheetah Across-episode Reward and Dynamics Changes A-EP (R+D) (test) | Average Final Return | -15.2 | 6 |
| Locomotion | Half-Cheetah Across-episode Reward Changes A-EP (R) (test) | Average Final Return | -10.9 | 6 |
| Locomotion | Half-Cheetah Within-episode Dynamics Changes W-EP (D) (test) | Average Final Return | -48.2 | 6 |
| Locomotion | Half-Cheetah Across-episode Agent Changes A-EP (A) (test) | Average Final Return | -9.6 | 6 |
| Locomotion | Half-Cheetah Across-episode Dynamics Changes A-EP (D) (test) | Average Final Return | -24.4 | 6 |
| Imitation Learning | Half-Cheetah | Mean Score | 1,839.8 | 6 |
| Locomotion | Half Cheetah sigma=0.7 | Reward | 482 | 6 |
| Locomotion | Half Cheetah sigma=0.5 | Reward | 669 | 6 |
| Locomotion | Half Cheetah sigma=0.1 | Reward | 834 | 6 |
| Trajectory Optimization | Half Cheetah | Computational Time (s) | 26.4 | 5 |
| Reinforcement Learning | half-cheetah 10-p | Normalized Return | 80.5 | 4 |
| Reinforcement Learning | half-cheetah 2-p | Normalized Return | 44.9 | 4 |
| Reinforcement Learning | half-cheetah 4-p | Normalized Return | 54.7 | 4 |
| Long-horizon prediction | Half Cheetah | NLL | -2.8 | 4 |
| Locomotion | Half-Cheetah Continuous Dynamics Changes CONT (D) (test) | Average Final Return | -12.3 | 4 |
| Locomotion | Half Cheetah | - | - | 0 |