| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| Reinforcement Learning | BipedalWalker | Average Episode Reward | 314.24 | 20 |
| Continuous Control | BipedalWalker v3 | Episodic Cumulative Reward | 298.4 | 15 |
| Locomotion | BipedalWalker Overall Mean | Mean Return | 89.95 | 11 |
| Locomotion | BipedalWalker Roughness terrain | Mean Return | 224.4 | 11 |
| Locomotion | BipedalWalker Stump terrain | Mean Return | 34.16 | 11 |
| Locomotion | BipedalWalker PitGap terrain | Mean Return | -7.65 | 11 |
| Locomotion | BipedalWalker Stairs terrain | Mean Return | -0.66 | 11 |
| Locomotion | BipedalWalker Hardcore terrain | Mean Return | 86.83 | 11 |
| Locomotion | BipedalWalker Basic terrain | Mean Return | 293.67 | 11 |
| Solved Rate | BipedalWalker Zero-Shot (test) | Basic Solved Rate | 100 | 9 |
| Reinforcement Learning | BipedalWalker | Training Time (h) | 18.38 | 9 |
| Continuous Control | BipedalWalker Nonmarkov v3 | AUC@T | 184.7 | 9 |
| Robotic Control | BipedalWalker v3 | Local Optima Escape Rate | 83.5 | 7 |
| Environment Interaction | BipedalWalker | Environment Steps (M) | 347 | 7 |
| Adaptability Evaluation | BipedalWalker mass variations (test) | AUC | 295.65 | 6 |
| Adaptability Evaluation | BipedalWalker friction variations (test) | AUC | 1,429.66 | 6 |
| Quality-Diversity | BipedalWalker | GT QD Score | 6.09 | 6 |
| Reinforcement Learning | BipedalWalker v3 | Return | 273.2 | 6 |
| Reinforcement Learning | bipedalwalker Sticky | AUC@T | 42,687,915.83 | 2 |
| Reinforcement Learning | bipedalwalker Noisy | AUC@T | 32,301,685.83 | 2 |
| Reinforcement Learning | bipedalwalker (Clean) | AUC@T | 9,392,950.11 | 2 |
| Reinforcement Learning | BipedalWalker standard (test) | Length | 17 | 2 |
| Interpretability Evaluation | BipedalWalker | Interpretability Score | 3.2 | 2 |