| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| Reinforcement Learning | MuJoCo Ant v2 | Average Return | 10,133 | 18 |
| Continuous Control | MuJoCo Ant fixed random adversary L=0.1 | Average Performance | 8,041 | 12 |
| Reinforcement Learning | MuJoCo Ant (test) | Average Reward | 7,586 | 12 |
| Continuous Control | MuJoCo Ant logarithmic adversary v1 | Avg Performance | 8,061 | 12 |
| Reinforcement Learning | MuJoCo Ant 1.5 density v1 (test) | Reward | 5,193 | 7 |
| Continuous Control | MuJoCo Ant 10-p v4 | Normalized Return | 92.7 | 6 |
| Continuous Control | MuJoCo Ant 2-p v4 | Normalized Return | 146.1 | 6 |
| Continuous Control | MuJoCo Ant v2 (train) | Mean Return | 4,796 | 6 |
| Reinforcement Learning | MuJoCo Ant epsilon=0.15 (test) | Natural Return | 5,381 | 5 |
| Continuous Control | MuJoCo Ant 1M steps v3 | Average Return | 5,930 | 5 |
| Continuous Control | MuJoCo Ant v3 (500K steps) | Average Return | 5,009 | 5 |
| Continuous Control | MuJoCo Ant v5 (test) | Average Return | 5,867 | 4 |
| Off-dynamics Reinforcement Learning | MuJoCo Ant 0.5 density dynamics shift (test) | Reward | 3,798 | 4 |
| Inverse Reinforcement Learning | MuJoCo Ant (test) | Average Performance | 5,783 | 4 |
| Meta-Reinforcement Learning | MuJoCo Ant Body variation (test) | CVaR 0.05 Return | 1,368 | 2 |
| Meta-Reinforcement Learning | MuJoCo Ant Mass variation (test) | CVaR 0.05 Return | 1,385 | 2 |
| Meta-Reinforcement Learning | MuJoCo Ant Goal variation (test) | CVaR 0.05 Return | -454 | 2 |
| Continuous control locomotion | MuJoCo Ant v3 (train) | Avg Performance (1M Steps) | 762 | 2 |