Reinforcement Learning on LunarLander v3 (Average Agent Reward)
Top score: 289 Average Agent Reward (SAC - H-EARS)
[Chart: Average Agent Reward over time, Jan 21, 2026 to Mar 12, 2026; y-axis ticks from -97.88 to 203.44]
Updated 1mo ago
Evaluation Results
| Method | Configuration | Date | Average Agent Reward |
|---|---|---|---|
| SAC - H-EARS | Algorithm Variant=H-EA... | 2026.03 | 289 |
| TD3 - Vanilla | Algorithm Variant=Vani... | 2026.03 | 279 |
| TD3 - H-EARS | Algorithm Variant=H-EA... | 2026.03 | 277 |
| SAC - Vanilla | Algorithm Variant=Vani... | 2026.03 | 268 |
| PPO - H-EARS | Algorithm Variant=H-EA... | 2026.03 | 258 |
| DDPG - H-EARS | Algorithm Variant=H-EA... | 2026.03 | 250 |
| POEM | Evaluation Episodes=15 | 2026.01 | 242.1 |
| PPO - Vanilla | Algorithm Variant=Vani... | 2026.03 | 235 |
| DDPG - Vanilla | Algorithm Variant=Vani... | 2026.03 | 231 |
| PPO | Evaluation Episodes=15 | 2026.01 | 210.94 |
| PDA | Environment steps=1M,... | 2026.03 | 204.7 |
| PPO | Environment steps=1M,... | 2026.03 | 204.4 |
| NPG | Environment steps=1M,... | 2026.03 | 34.1 |
| TRPO | Environment steps=1M,... | 2026.03 | -83 |
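
The results above list an H-EARS and a Vanilla entry for each of SAC, TD3, PPO, and DDPG. As a rough aid for reading them, the sketch below computes the H-EARS minus Vanilla difference per algorithm. The scores are copied from the leaderboard; the names `scores` and `variant_gains` are hypothetical and used only for illustration. The leaderboard gives single numbers with no variance or seed counts, so the differences should be read as indicative only.

```python
# Average Agent Reward on LunarLander v3, copied from the leaderboard entries.
# Keys are (algorithm, variant); the layout is an illustrative assumption.
scores = {
    ("SAC", "H-EARS"): 289,
    ("SAC", "Vanilla"): 268,
    ("TD3", "H-EARS"): 277,
    ("TD3", "Vanilla"): 279,
    ("PPO", "H-EARS"): 258,
    ("PPO", "Vanilla"): 235,
    ("DDPG", "H-EARS"): 250,
    ("DDPG", "Vanilla"): 231,
}


def variant_gains(scores):
    """Return H-EARS minus Vanilla reward for each algorithm that has both entries."""
    algorithms = {algo for algo, _ in scores}
    return {
        algo: scores[(algo, "H-EARS")] - scores[(algo, "Vanilla")]
        for algo in algorithms
        if (algo, "H-EARS") in scores and (algo, "Vanilla") in scores
    }


gains = variant_gains(scores)

# Print the differences, largest first, with an explicit sign.
for algo, delta in sorted(gains.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{algo}: {delta:+d}")
```

On these numbers the H-EARS entry is higher for PPO (+23), SAC (+21), and DDPG (+19), and slightly lower for TD3 (-2).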