Reinforcement Learning on LunarLander v3 (Average Agent Reward)

289 Average Agent Reward

SAC - H-EARS

Updated 4mo ago

Evaluation Results

Method	Date	Average Agent Reward
SAC - H-EARS	2026.03	289
TD3 - Vanilla	2026.03	279
TD3 - H-EARS	2026.03	277
SAC - Vanilla	2026.03	268
PPO - H-EARS	2026.03	258
DDPG - H-EARS	2026.03	250
POEM	2026.01	242.1
PPO - Vanilla	2026.03	235
DDPG - Vanilla	2026.03	231
PPO	2026.01	210.94
PDA	2026.03	204.7
PPO	2026.03	204.4
NPG	2026.03	34.1
TRPO	2026.03	-83