| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| Reinforcement Learning | Swimmer | Average Returns | 260.35 | 24 |
| Reinforcement Learning | Swimmer v3 | Mean Reward | 359.79 | 15 |
| Reinforcement Learning | Swimmer v4 | Average Episodic Reward | 128.5 | 12 |
| Locomotion | Swimmer | Relative Return Improvement | 3.59 | 10 |
| Simulator Benchmark Optimization | Swimmer | Median Performance | -339.62 | 9 |
| Continuous Control | Swimmer v5 | Average Episodic Reward | 119.3 | 8 |
| Imitation Learning from Observation | Swimmer v4 | AER | 0.73 | 8 |
| Multi-objective Reinforcement Learning | Swimmer-2 | Hypervolume (HV) | 3.54 | 6 |
| Reinforcement Learning | Swimmer-vel online downstream setting | Normalized Reward | 1.65 | 6 |
| Safe Reinforcement Learning | Swimmer-vel (offline) | Normalized Reward | 2.39 | 6 |
| Quality-Diversity | Swimmer | GT QD Score | 21.31 | 6 |
| Continuous Control | Swimmer v2 (test) | Average Return | 354 | 6 |
| Imitation Learning | Swimmer | Mean Score | 141.1 | 6 |
| Reinforcement Learning | Swimmer (Gymnasium) | Mean Best Reward | 294.57 | 5 |
| Continuous Control | Swimmer MuJoCo v5 | Max Return | 114 | 5 |
| Reinforcement Learning | Swimmer | Mean Best Reward | 294.57 | 4 |
| Safe Reinforcement Learning | Swimmer Velocity tasks suite | Average Reward | 36.32 | 4 |
| Reinforcement Learning | Swimmer Velocity tasks suite (test) | Average Reward | 36.32 | 4 |
| Robot Co-design | Swimmer search phase | D2C Score | 9.35 | 3 |
| Continuous Control | Swimmer v5 | Terminal Performance | 22.3 | 2 |
| Reinforcement Learning | Swimmer | Terminal Performance | 22.3 | 2 |