| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| Reinforcement Learning | CartPole Pure | Average Reward (2/0.5) | 200 | 30 |
| Reinforcement Learning | CartPole v1 (test) | Total Reward | 500 | 25 |
| Continuous Control | Cartpole | Median Samples | 1.63 | 10 |
| Reinforcement Learning | CartPole Pure-8-40 | Average Reward (EpLen=2, DF=0.5) | 200 | 10 |
| Reinforcement Learning | CartPole | Average Reward | 989.5 | 9 |
| Actuator Inversion | Cartpole (Ceval-in) | AER | 659 | 8 |
| Actuator Inversion | Cartpole (train) | AER | 658 | 8 |
| Control | Cartpole swing-up | Median Samples | 111 | 8 |
| Reinforcement Learning | CartPole v0 | Mean Score | 199.2 | 8 |
| Safe Reinforcement Learning | Safe CartPole | Training Time (s) | 68.7 | 7 |
| Reinforcement Learning | Safe CartPole | Episode Reward | 200 | 7 |
| Control Task | CartPole (test) | Average Reward | 494.09 | 6 |
| Robustness Evaluation | CartPole A=9.5 (test) | Average Reward | 231.8 | 6 |
| Robustness Evaluation | CartPole A=9.0 (test) | Average Reward | 830.9 | 6 |
| Robustness Evaluation | CartPole A=8.5 (test) | Average Reward | 810.1 | 6 |
| Robustness Evaluation | CartPole A=8.0 (test) | Average Reward | 1,000 | 6 |
| Sensory-motor control | CartPole *2 | Reward (First Iter, Worst) | 9.8 | 5 |
| Sensory-motor control | CartPole | Reward (First Iteration, Worst Rep) | 42.15 | 5 |
| Reinforcement Learning | cartpole | Return | 200 | 5 |
| Reinforcement Learning | CartPole v1 | Return | 354,122 | 5 |
| Reinforcement Learning | CartPole 10% action noise (test) | Return (Noisy) | 279 | 4 |
| Reinforcement Learning | CartPole Clean (test) | Clean Return | 354,122 | 4 |
| Neural Network Verification | cartpole | Time (s) | 142 | 3 |
| Reinforcement Learning | CartPole | Total Episodes | 2,000,000 | 3 |
| Meta-Reinforcement Learning | Cartpole fl-ood | FLOPs (k) | 0.004 | 3 |