MDP

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Near-optimal policy identification | MDP | Metric: -, 0 | |
| Transfer Reinforcement Learning | MDP with Tucker rank (d, d, d) | Metric: -, 0 | |
| Transfer Reinforcement Learning | MDP with Tucker rank (S, S, d) | Metric: -, 0 | |
| Compute epsilon-optimal policy | MDP sample setting | Metric: -, 0 | |