
Reinforcement Learning on LunarLander v3 (Average Agent Reward)

289 Average Agent Reward

SAC - H-EARS

[Chart: Average Agent Reward over time. Y-axis ticks: -97.88, 2.56, 103, 203.44. X-axis: Jan 21, 2026 to Mar 12, 2026.]
Updated 1mo ago

Evaluation Results

Date (YYYY.MM)    Average Agent Reward
2026.03           289
2026.03           279
2026.03           277
2026.03           268
2026.03           258
2026.03           250
2026.01           242.1
2026.03           235
2026.03           231
2026.01           210.94
2026.03           204.7
2026.03           204.4
2026.03           34.1
2026.03           -83