| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| Reinforcement Learning Control | Pendulum v1 | Mean Score | 1,378.78 | 40 |
| Reinforcement Learning | Pendulum | Avg Episode Reward | -145.49 | 26 |
| State Estimation | Pendulum | MAE | 0.7 | 21 |
| Reinforcement Learning | Pendulum v1 (test) | Average Return | -164.82 | 16 |
| Regression | Pendulum (test) | MSE | 0.0034 | 14 |
| Continuous Control | Pendulum | Robustness Gap | 0.72 | 12 |
| Rollout Prediction | Pendulum | Rollout MSE | 1.05 | 12 |
| Continuous Control | Pendulum | Median Samples | 5.6 | 12 |
| Continuous Control | Pendulum v1 | Average Cumulative Reward | -150.8 | 11 |
| Regression | Pendulum | MSE | 3.32 | 11 |
| Parameter Estimation | Pendulum 90cm | Length (m) | 1.07 | 9 |
| Continuous Control | Pendulum Nonmarkov v1 (test) | AUC@T | -556.9 | 9 |
| Control | Pendulum v0 | Median Samples | 21 | 9 |
| Transition Model Estimation | Pendulum discretized n = 10^5 | Failure Rate | 0 | 8 |
| Image Interpolation | Pendulum (test) | MSE | 1 | 8 |
| Reinforcement Learning | Pendulum classical control (1M steps) | Return | -133.42 | 8 |
| Dynamical Identification | Pendulum Numerical | AUC | 99 | 7 |
| Robotic Control | Pendulum v1 | Local Optima Escape Rate | 89.2 | 7 |
| Counterfactual Generation | Pendulum (test) | MAE (Pendulum (p) \| do(p)) | 0.013 | 6 |
| Reinforcement Learning | Pendulum | Steps (Mean) | 80,240 | 6 |
| Reinforcement Learning | Pendulum PD-C (test) | Cumulative Reward | 854 | 6 |
| Continuous Control (Negative Reward) | Pendulum Pybullet | Mean Return | 9,124.6 | 6 |
| Continuous Control (Positive Reward) | Pendulum Pybullet | Return | 9,043.3 | 6 |
| Continuous Control (Negative Reward) | Pendulum Mujoco | Mean Return | 8,132.1 | 6 |
| Continuous Control (Positive Reward) | Pendulum Mujoco | Return | 9,358.4 | 6 |