| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Offline Policy Selection | Sepsis Simulator simulated (test) | AE 0.006 | 18 |
| Reward transfer | Sepsis Simulator (D1 fraction 0.2) | Regret 0.0546 | 9 |
| Value Estimation | Sepsis simulator | Bias 0.001 | 8 |
| Off-Policy Evaluation | Sepsis Simulator | RMSE 0.013 | 6 |