| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Policy Evaluation | 400-State Random MDP (on-policy) | Sum of sqrt MSE: 24.74 | 7 |
| Policy Evaluation | 400-State Random MDP (off-policy) | MSE: 0.11 | 7 |
| Policy Evaluation | 400-State Random MDP (on-policy) | MSE: 0.07 | 7 |
| Policy Evaluation | 400-State Random MDP (off-policy) | Sum of sqrt MSE: 29.65 | 6 |