| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Online Reinforcement Learning | OpenAI Gym MuJoCo Normalized v4 | Normalized Mean Return: 95.5 | 50 |
| Continuous Control | OpenAI Gym Mujoco 200K steps v2 (train) | InvertedPendulum-v2 Return: 1,000 | 5 |
| Continuous Control | OpenAI Gym Mujoco 100K steps v2 (train) | InvertedPendulum-v2 Score: 1,000 | 5 |