GridWorld

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Environment Unlearning | GridWorld | Reconstruction Rate (Pre-unlearn): 100 | 18 |
| Anomaly Detection | Gridworld (test) | Mechanism Mean AUPR: 88.25 | 13 |
| Anomaly Detection | Gridworld newobj | AUROC: 98.22 | 13 |
| Anomaly Detection | Gridworld mechanism | AUROC: 84.59 | 13 |
| POMDP Abstraction | GridWorld 3 x 3 | Validation Error: 0 | 10 |
| Trajectory Unlearning | GridWorld (test) | Pre-unlearn Score: 96 | 9 |
| Trajectory Unlearning | GridWorld | Unlearn Efficacy: 98 | 9 |
| State Inference Attack | GridWorld | Pre-unlearn Acc: 99 | 9 |
| Planning | 8x8 two-room gridworld (test) | Validity (%): 0.89 | 8 |
| Reinforcement Learning | GridWorld | Average Update Time (s): 0.1419 | 7 |
| Navigation | Gridworld | Avg Episode Return: 42.9 | 6 |
| Reinforcement Learning | Gridworld | Sample Complexity: 33 | 5 |
| Navigation Reasoning | Gridworld o.o.d 20x20 (out-of-domain) | Pass@1: 91.5 | 4 |
| Navigation Reasoning | Gridworld 10x10 (in-domain) | Pass@1: 100 | 4 |
| Optimal policy search | GridWorld | Iterations (gamma=0.9): 8 | 3 |
| Generating CFMDPs | GridWorld p = 0.4 | Mean Execution Time (s): 0.336 | 2 |
| Generating CFMDPs | GridWorld p = 0.9 | Mean Execution Time (s): 0.261 | 2 |
| Counterfactual Policy Evaluation | GridWorld p = 0.4 Catastrophic Path | Lowest Cumulative Reward: -698 | 2 |
| Counterfactual Policy Evaluation | GridWorld (p = 0.4) - Almost Catastrophic | Lowest Cumulative Reward: 14 | 2 |
| Counterfactual Policy Evaluation | GridWorld p = 0.4 Slightly Suboptimal Path | Lowest Cumulative Reward: 19 | 2 |
| Counterfactual Policy Evaluation | GridWorld (p = 0.9) Catastrophic Path | Lowest Cumulative Reward: -698 | 2 |
| Counterfactual Policy Evaluation | GridWorld (p = 0.9) - Almost Catastrophic | Cumulative Reward (Lowest): -697 | 2 |
| Counterfactual Policy Evaluation | GridWorld (p = 0.9) Slightly Suboptimal Path | Lowest Cumulative Reward: -495 | 2 |
| Counterfactual Policy Evaluation | GridWorld p = 0.4 | Worst-Case Counterfactual V(s0): 230 | 2 |
| Counterfactual Policy Evaluation | GridWorld p = 0.9 | Avg Worst-Case Counterfactual V(s0): 346 | 2 |
Showing 25 of 27 rows