
Reinforcement Learning on LunarLander v2

Final Return: 2,292

Advantage-weighting

[Chart: Final Return over time, Oct 1, 2019 to Dec 14, 2020]

Evaluation Results

Date       Values
2020.12    2,292518,153
2020.12    278.23518,153
2020.12    272.14118.9
2020.12    266121,000,000
2020.12    262.1886.9
2020.12    258.877.6
2020.12    254.584,337.3
2020.12    248.221632,620.2
2019.10    229-
2020.12    225.791,295,307.1
2020.12    217.92210,733.2
2020.12    217.09647,691.1
2020.12    201.471,673
2020.12    201.4630,878.1
2020.12    201.4630,878.1
2020.12    200.65259,285.8
2020.12    200.3237,079.7
2020.12    200.2230,878.1
2019.10    185-
2020.12    132.83136.7
2019.10    121-
2019.10    104-
2020.12    -78.489
2020.12    -123.3439