
GridWorld

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Question Answering | GridWorld (full) | Accuracy: 95 | 22 |
| Environment Unlearning | GridWorld | Reconstruction Rate (Pre-unlearn): 100 | 18 |
| Anomaly Detection | Gridworld (test) | Mechanism Mean AUPR: 88.25 | 13 |
| Anomaly Detection | Gridworld newobj | AUROC: 98.22 | 13 |
| Anomaly Detection | Gridworld mechanism | AUROC: 84.59 | 13 |
| POMDP Abstraction | GridWorld 3 x 3 | Validation Error: 0 | 10 |
| Trajectory Unlearning | GridWorld (test) | Pre-unlearn Score: 96 | 9 |
| Trajectory Unlearning | GridWorld | Unlearn Efficacy: 98 | 9 |
| State Inference Attack | GridWorld | Pre-unlearn Acc: 99 | 9 |
| Proactive Assistance | Gridworld | Speedup: 24.5 | 8 |
| Planning | 8x8 two-room gridworld (test) | Validity (%): 0.89 | 8 |
| Reinforcement Learning | GridWorld | Average Update Time (s): 0.1419 | 7 |
| Navigation | Gridworld | Avg Episode Return: 42.9 | 6 |
| Flow Matching | GridWorld (test) | Flow Matching Loss: 1.33 | 5 |
| Flow Matching | GridWorld (val) | Flow Matching Loss: 1.25 | 5 |
| Inverse Transition Learning | Gridworld | Epsilon Matching: 78 | 5 |
| Inverse Transition Learning | Gridworld 40% stochastic-policy states | Epsilon Matching Error: 0.37 | 5 |
| Reinforcement Learning | Gridworld | Sample Complexity: 33 | 5 |
| Deep Reinforcement Learning | Gridworld (test) | Usefulness: 74.2 | 4 |
| Navigation Reasoning | Gridworld o.o.d 20x20 (out-of-domain) | Pass@1: 91.5 | 4 |
| Navigation Reasoning | Gridworld 10x10 (in-domain) | Pass@1: 100 | 4 |
| Multi-Objective Constraint Inference | 5x5 Gridworld | CMSE: 0.027 | 3 |
| Optimal policy search | GridWorld | Iterations (gamma=0.9): 8 | 3 |
| Generating CFMDPs | GridWorld p = 0.4 | Mean Execution Time (s): 0.336 | 2 |
| Generating CFMDPs | GridWorld p = 0.9 | Mean Execution Time (s): 0.261 | 2 |
Showing 25 of 35 rows