| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| Continuous Robot Control | HalfCheetah v3 (test) | Reward | 11,767 | 48 |
| Reinforcement Learning | Halfcheetah v5 | Average Return | 13,996.2 | 43 |
| Offline Reinforcement Learning | halfcheetah medium-replay | Normalized Score | 68.4 | 43 |
| Offline Reinforcement Learning | halfcheetah medium | Normalized Score | 68.2 | 43 |
| Offline Reinforcement Learning | Halfcheetah D4RL v2 (offline) | Average Score | 56 | 32 |
| Offline Reinforcement Learning | halfcheetah Mixed Dataset | Normalized Reward | 74.5 | 24 |
| Offline Reinforcement Learning | HalfCheetah Medium-Expert Gym-MuJoCo D4RL | Normalized Score | 95.1 | 18 |
| Reinforcement Learning | HalfCheetah | Average Return | 7,223.53 | 17 |
| Offline Reinforcement Learning | HalfCheetah kinematic shifts | Score | 79.6 | 16 |
| Offline Reinforcement Learning | HalfCheetah Gym-MuJoCo Medium-Replay D4RL | Normalized Score | 48.9 | 16 |
| Offline Reinforcement Learning | Halfcheetah | Average Return | 7,357.5 | 16 |
| Offline Reinforcement Learning | HalfCheetah medium-expert | Normalized Score | 107.6 | 15 |
| Reinforcement Learning | HalfCheetah v3 | Mean Reward | 17,177 | 15 |
| Cross-Domain Offline Policy Adaptation | halfcheetah med Source Target | Normalized Score | 69.7 | 14 |
| Offline Policy Adaptation | halfcheetah medium-expert | Normalized Score | 42.5 | 14 |
| Offline Reinforcement Learning | HalfCheetah random | Normalized Score | 33.8 | 14 |
| Continuous Control | HalfCheetah v5 | Normalized Mean Return | 1.05 | 12 |
| Offline Reinforcement Learning | HalfCheetah Expert | Episodic Return | 1,002.09 | 12 |
| Reinforcement Learning | HalfCheetah fixed linear adversary | Average Performance | 7,495 | 12 |
| Worst-case time-constrained reinforcement learning | HalfCheetah MuJoCo (test) | Normalized Worst-Case Reward | 2.76 | 12 |
| Robust Reinforcement Learning | HalfCheetah fixed exponential adversary MuJoCo | Avg Performance | 8,256 | 12 |
| Continuous Control | HalfCheetah MuJoCo (test) | Worst-case Performance | 7,526 | 12 |
| Robot Locomotion | HalfCheetah v1 (test) | Score | 4,273.31 | 12 |
| Imitation Learning | HalfCheetah one-shot v2 | Normalized Score | 5.6 | 11 |
| Reinforcement Learning | HalfCheetah (Pure-8-40) | Average Reward (0.3/1.7) | 7,459 | 10 |