| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| Target Acquisition | Reacher Quadrant (test) | Target Hit Rate (per min) | 19.7 | 15 |
| Target Acquisition | Reacher Linear (test) | Hit Rate (per minute) | 20 | 15 |
| Target Acquisition | Reacher Continuous (test) | Hit Rate (per min) | 18.8 | 15 |
| Reinforcement Learning | Reacher | Average Return | -4.1 | 12 |
| Reaching | Reacher 3D | Success Rate | 95.1 | 10 |
| Continuous Control | reacher | Average Reward | 0.72 | 9 |
| Actuator Inversion | Reacher H (eval-in) | AER | 291 | 8 |
| Actuator Inversion | Reacher E C (eval-in) | AER | 582 | 8 |
| Actuator Inversion | Reacher H (train) | AER | 290 | 8 |
| Actuator Inversion | Reacher E C (train) | AER | 584 | 8 |
| Control | Reacher v2 | Median Samples | 251 | 8 |
| Continuous Control | Reacher v2 | Average Return | -4 | 7 |
| Off-dynamics Reinforcement Learning | Reacher 0.5 density v1 (test) | Reward | -11.7 | 7 |
| Off-dynamics Reinforcement Learning | Reacher broken source environment MuJoCo | Average Reward | 30 | 7 |
| Reinforcement Learning | Reacher 1.5 gravity MuJoCo | Reward | -9.5 | 7 |
| Reinforcement Learning | Reacher 0.5 gravity (test) | Average Return | -7.1 | 7 |
| Continuous Control | Reacher v1 (train) | Max Avg Return | -3.6 | 7 |
| Continuous Control (Negative Reward) | Reacher PyBullet | Mean Return | 16.8 | 6 |
| Continuous Control (Positive Reward) | Reacher PyBullet | Mean Return | 18.7 | 6 |
| Continuous Control (Negative Reward) | Reacher MuJoCo | Mean Return | -6.3 | 6 |
| Continuous Control | fixedReacher | Average Reward | 0.849 | 6 |
| Reinforcement Learning | Reacher | IQM Returns | -11.69 | 4 |
| Reinforcement Learning | Reacher target domain 1.5 density | Reward | -7.1 | 4 |
| Reinforcement Learning | Reacher target 0.5 gravity | Reward | -7.2 | 4 |
| Reinforcement Learning | Reacher 1.5 gravity MuJoCo (test) | Reward | -8.3 | 4 |