| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Reinforcement Learning | Acrobot v1 | Mean Return -140.16 | 14 |
| Continuous Control | Acrobot Nonmarkov v1 (test) | AUC@T -82.7 | 9 |
| Identifying Optimal Trajectories | Acrobot v1 (top-5 trajectories) | Average Trajectory Length 73.2 | 6 |
| Reinforcement Learning | Acrobot | Average Returns -86.2 | 5 |
| Acrobot Control | Acrobot | Success Rate 100 | 4 |
| Reinforcement Learning | Acrobot standard (train/test) | Model Parameters 850 | 4 |
| Sensory-motor control | Acrobot Gymnasium | Mean Best Reward -77.3 | 2 |
| Reinforcement Learning | acrobot Sticky | AUC@T -84,528,355.1 | 2 |
| Reinforcement Learning | acrobot Noisy | AUC @ T -80,111,477.55 | 2 |
| Reinforcement Learning | acrobot Clean | AUC@T -79,144,263.52 | 2 |
| Continuous Control | Acrobot Nonmarkov v1 | AUC@T - | 0 |