
LunarLander

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Reinforcement Learning | LunarLander v2 | Final Return: 2,292 | 30 |
| Reinforcement Learning | LunarLander | Average Episode Reward: 283.56 | 15 |
| Reinforcement Learning | LunarLander v3 | Average Agent Reward: 289 | 14 |
| Continuous-state and discrete-action control | LunarLander v3 | Average Reward: 232.9 | 13 |
| Reinforcement Learning | LunarLander | Training Time (min): 0.8 | 13 |
| Control | LunarLander | Robustness Gap: 0.18 | 12 |
| Black-box Optimization | LunarLander frozen noise v3 | Total Proposal Time (s): 0 | 9 |
| Continuous Control | LunarLander Nonmarkov v2 (test) | AUC@T: 107.6 | 9 |
| Black-box Optimization | LunarLander natural noise v3 | Total Proposal Time (s): 0.3 | 8 |
| Reinforcement Learning | LunarLander v3 | Coefficient of Variation: 3.2 | 8 |
| Reinforcement Learning | LunarLander v3 | Episodes to Threshold (Score 200): 290 | 8 |
| Reinforcement Learning | LunarLander classical control 1M steps | Return: 267.19 | 8 |
| Offline Reinforcement Learning | LunarLander Gym (100k) | Returns: 107 | 6 |
| Offline Reinforcement Learning | LunarLander Gym (10k) | Returns: 102 | 6 |
| Offline Reinforcement Learning | LunarLander Gym (1k) | Returns: 97 | 6 |
| Reinforcement Learning | LunarLander (LL) (test) | Average Undiscounted Reward: 241 | 6 |
| Trajectory Ranking | LunarLander v2 | Average Reward: 207.13 | 6 |
| Reinforcement Learning | LunarLander | Maximum Return: 260.6 | 5 |
| Reinforcement Learning | LunarLander | Return: 166.3 | 3 |
| Reinforcement Learning | LunarLander | Environment Episodes: 400,000 | 3 |
| Meta-Reinforcement Learning | Lunarlander g | FLOPs (k): 0.015 | 3 |
| Reinforcement Learning | LunarLander v2 | Episodes to Target Reward: 750 | 2 |
| Reinforcement Learning | lunarlander Sticky | AUC@T: 36,783,880.67 | 2 |
| Reinforcement Learning | lunarlander Noisy | AUC @ T: -25,766,227.01 | 2 |
| Reinforcement Learning | lunarlander Clean | AUC@T: 42,642,395.61 | 2 |
Showing 25 of 30 rows