| Task Name | Dataset Name | SOTA Result | Trend | |
|---|---|---|---|---|
| Environment Unlearning | GridWorld | Reconstruction Rate (Pre-unlearn): 100 | 18 | |
| Anomaly Detection | Gridworld (test) | Mechanism Mean AUPR: 88.25 | 13 | |
| Anomaly Detection | Gridworld newobj | AUROC: 98.22 | 13 | |
| Anomaly Detection | Gridworld mechanism | AUROC: 84.59 | 13 | |
| POMDP Abstraction | GridWorld 3 x 3 | Validation Error: 0 | 10 | |
| Trajectory Unlearning | GridWorld (test) | Pre-unlearn Score: 96 | 9 | |
| Trajectory Unlearning | GridWorld | Unlearn Efficacy: 98 | 9 | |
| State Inference Attack | GridWorld | Pre-unlearn Acc: 99 | 9 | |
| Planning | 8x8 two-room gridworld (test) | Validity (%): 0.89 | 8 | |
| Reinforcement Learning | GridWorld | Average Update Time (s): 0.1419 | 7 | |
| Navigation | Gridworld | Avg Episode Return: 42.9 | 6 | |
| Reinforcement Learning | Gridworld | Sample Complexity: 33 | 5 | |
| Navigation Reasoning | Gridworld o.o.d 20x20 (out-of-domain) | Pass@1: 91.5 | 4 | |
| Navigation Reasoning | Gridworld 10x10 (in-domain) | Pass@1: 100 | 4 | |
| Optimal policy search | GridWorld | Iterations (gamma=0.9): 8 | 3 | |
| Generating CFMDPs | GridWorld p = 0.4 | Mean Execution Time (s): 0.336 | 2 | |
| Generating CFMDPs | GridWorld p = 0.9 | Mean Execution Time (s): 0.261 | 2 | |
| Counterfactual Policy Evaluation | GridWorld p = 0.4 Catastrophic Path | Lowest Cumulative Reward: -698 | 2 | |
| Counterfactual Policy Evaluation | GridWorld (p = 0.4) - Almost Catastrophic | Lowest Cumulative Reward: 14 | 2 | |
| Counterfactual Policy Evaluation | GridWorld p = 0.4 Slightly Suboptimal Path | Lowest Cumulative Reward: 19 | 2 | |
| Counterfactual Policy Evaluation | GridWorld (p = 0.9) Catastrophic Path | Lowest Cumulative Reward: -698 | 2 | |
| Counterfactual Policy Evaluation | GridWorld (p = 0.9) - Almost Catastrophic | Cumulative Reward (Lowest): -697 | 2 | |
| Counterfactual Policy Evaluation | GridWorld (p = 0.9) Slightly Suboptimal Path | Lowest Cumulative Reward: -495 | 2 | |
| Counterfactual Policy Evaluation | GridWorld p = 0.4 | Worst-Case Counterfactual V(s0): 230 | 2 | |
| Counterfactual Policy Evaluation | GridWorld p = 0.9 | Avg Worst-Case Counterfactual V(s0): 346 | 2 | |