| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| Reinforcement Learning | InvDoublePend | Episodes Run | 2,000,000 | 3 |
| Meta-Reinforcement Learning | InvDoublePend params | FLOPs (k) | 39 | 3 |
| Reinforcement Learning | InvDoublePend standard (test) | Episode Length | 15 | 2 |
| Interpretability Evaluation | InvDoublePend | Interpretability Score | 5 | 2 |