
Point-Maze

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Inverse Reinforcement Learning | Point Maze | Normalized Performance: 1.03 | 6 |
| Policy Generalization | Point Maze (test) | Average Return: -5.21 | 6 |
| Reward Adaptation | Point-Maze Shift (meta-test) | Average Return: -5.37 | 4 |
| Inverse Reinforcement Learning | Point Maze Flipped | Normalized Performance: 96 | 3 |
| Policy Generalization | Point-Maze-Shift (meta-test) | Average Return: -28.61 | 3 |