| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Inverted Double Pendulum | Inverted Double Pendulum | Convergence (%): 100 | 20 |
| Reinforcement Learning | Inverted Double Pendulum | Avg Episode Reward: 9,359.92 | 10 |
| Continuous Control | Inverted Double Pendulum | Normalized AUC: 0.96 | 3 |