
LunarLander

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Reinforcement Learning | LunarLander v2 | Final Return: 2,292 | 23 |
| Reinforcement Learning | LunarLander v3 | Average Agent Reward: 289 | 14 |
| Reinforcement Learning | LunarLander | Average Episode Reward: 283.56 | 10 |
| Continuous Control | LunarLander Nonmarkov v2 (test) | AUC@T: 107.6 | 9 |
| Reinforcement Learning | LunarLander v3 | Coefficient of Variation: 3.2 | 8 |
| Reinforcement Learning | LunarLander v3 | Episodes to Threshold (Score 200): 290 | 8 |
| Reinforcement Learning | LunarLander classical control 1M steps | Return: 267.19 | 8 |
| Offline Reinforcement Learning | LunarLander Gym (100k) | Returns: 107 | 6 |
| Offline Reinforcement Learning | LunarLander Gym (10k) | Returns: 102 | 6 |
| Offline Reinforcement Learning | LunarLander Gym (1k) | Returns: 97 | 6 |
| Reinforcement Learning | LunarLander (LL) (test) | Average Undiscounted Reward: 241 | 6 |
| Trajectory Ranking | LunarLander v2 | Average Reward: 207.13 | 6 |
| Reinforcement Learning | LunarLander | Maximum Return: 260.6 | 5 |
| Reinforcement Learning | LunarLander | Environment Episodes: 400,000 | 3 |
| Meta-Reinforcement Learning | Lunarlander g | FLOPs (k): 0.015 | 3 |
| Reinforcement Learning | lunarlander Sticky | AUC@T: 36,783,880.67 | 2 |
| Reinforcement Learning | lunarlander Noisy | AUC @ T: -25,766,227.01 | 2 |
| Reinforcement Learning | lunarlander Clean | AUC@T: 42,642,395.61 | 2 |
| Reinforcement Learning | LunarLander standard (test) | Episode Length: 16.5 | 2 |
| Interpretability Evaluation | LunarLander | Interpretability Score: 4 | 2 |
| Stochastic Lipschitz Optimization | LunarLander | Regret: 7 | 1 |
| Meta-Reinforcement Learning | LunarLander | Metric: - | 0 |
| Continuous Control | LunarLander Nonmarkov v2 | AUC@T: - | 0 |