| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Time Series Forecasting | Cheetah MuJoCo D (test) | MSE 0.65 | 26 |
| Reinforcement Learning | Cheetah | Return 934.76 | 24 |
| Meta-Reinforcement Learning | Cheetah-Vel-Sparse (OOD) | Average Return 296 | 15 |
| Locomotion | Cheetah-Vel-E (c^r) | Average Return -38.3 | 14 |
| Reinforcement Learning | Cheetah Task 2 | Return 21,100 | 12 |
| Reinforcement Learning | Cheetah Task 1 | Return 15.4 | 12 |
| Continuous Control | cheetah | Average Reward 934.76 | 12 |
| Meta-Reinforcement Learning | Cheetah-Vel | Average Return 60 | 10 |
| Locomotion Control | Cheetah-Dir-E (cr) | Average Return 962.1 | 8 |
| Actuator Inversion | Cheetah Ceval-in (eval-in) | AER 319 | 8 |
| Actuator Inversion | Cheetah (train) | AER 319 | 8 |
| Contact State Detection | Mini-Cheetah real-world | F1 (Leg LF) 93.6 | 6 |
| Single-life task completion | Cheetah | Average Steps 74,300 | 5 |
| Locomotion Control (Cheetah) | Cheetah 3000 episodes | Return (IQM) 280.97 | 4 |
| Meta-Reinforcement Learning | Cheetah vel-ood | FLOPs (k) 0.53 | 3 |
| Continuous Control | hardCheetah | Average Reward 1.311 | 3 |
| Latent space prediction | Cheetah | MSE 0.0003 | 2 |
| Reinforcement Learning | Cheetah | Zero-shot Reward 10,941,130 | 1 |