| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| Continual Reinforcement Learning | DMC6 (Continual) | Normalized Score | 1.04 | 24 |
| Code Generation | DMC | PASS@1 | 73.4 | 8 |
| Locomotion | DMC Dog-run (test) | Average Return | 232.6 | 8 |
| Locomotion | DMC Dog-walk (test) | Average Return | 788.7 | 8 |
| Locomotion | DMC Dog-stand (test) | Average Return | 879.2 | 8 |
| Locomotion | DMC Dog-trot (test) | Average Return | 585.4 | 8 |
| Locomotion | DMC Humanoid-walk (test) | Average Return | 226.5 | 8 |
| Reinforcement Learning | DMC Walker-run | Normalized AUC | 728.49 | 8 |
| Reinforcement Learning | DMC Quadruped-run | Normalized AUC | 717.24 | 8 |
| Reinforcement Learning | DMC Hopper-hop | Normalized AUC | 313.78 | 8 |
| Reinforcement Learning | DMC Cheetah-run | Normalized AUC | 721.85 | 8 |
| Continuous Control (Offline RL) | DMC Cheetah Run → Cheetah Nopaw v1 (offline) | Mean Reward | 493 | 8 |
| Continuous Control (Offline RL) | DMC Cheetah Run → Cheetah Uphill v1 (offline) | Mean Reward | 225 | 8 |
| Continuous Control (Offline RL) | DMC Cheetah Run → Cheetah Downhill v1 (offline) | Mean Reward | 745 | 8 |
| Continuous Control (Offline RL) | DMC Walker Walk → Walker Nofoot v1 (offline) | Mean Reward | 460 | 8 |
| Continuous Control (Offline RL) | DMC Walker Walk → Walker Uphill v1 (offline) | Mean Reward | 407 | 8 |
| Continuous Control (Offline RL) | DMC Walker Walk → Walker Downhill v1 (offline) | Mean Reward | 629 | 8 |
| Ball in cup-Catch | DMC 500K v1 (test) | Episodic Reward | 460,380 | 7 |
| Finger-Spin | DMC 500K v1 (test) | Episodic Reward | 884,128 | 7 |
| Walker-Walk | DMC 500K v1 (train) | Episodic Reward | 91,718 | 7 |
| Cheetah-Run | DMC 500K v1 (train) | Episodic Reward | 570,253 | 7 |
| Ball in cup-Catch | DMC 100K v1 (test) | Episodic Reward | 862,167 | 7 |
| Finger-Spin | DMC 100K v1 (test) | Episodic Reward | 693,141 | 7 |
| Walker-Walk | DMC 100K v1 (train) | Episodic Reward | 510,151 | 7 |
| Cheetah-Run | DMC 100K v1 (train) | Episodic Reward | 235,137 | 7 |