
Reinforcement Learning on CartPole Pure

Average Reward (2/0.5): 200

True env+MPC

Feb 13, 2021
Updated 1mo ago

Evaluation Results

Date       Average Reward
2021.02    200      198.4    200
2021.02    200      198.4    200
2021.02    200      198.4    200
2021.02    199.8    199.1    197.8
2021.02    199.7    193.8    192
2021.02    197.7    193      185.7
2021.02    171.1    193.4    64.2
2021.02    168      190.8    58.1
2021.02    166.7    197.7    189.4
2021.02    166.2    181.2    182.8
2021.02    163.7    198.1    190.4
2021.02    154.4    190.9    170.2
2021.02    150.5    175.6    65.1
2021.02    122.8    179.4    193.6
2021.02    98.9     162.1    86.1
2021.02    98.3     198.6    141.1
2021.02    92.3     193.6    127.5
2021.02    88.6     196.1    171
2021.02    67.4     30.6     7
2021.02    67.3     65.6     140
2021.02    65.4     79.2     132.1
2021.02    50.2     68.9     40.4
2021.02    44.8     57.9     136.4
2021.02    43.9     21.1     39.5
2021.02    42.4     199      199.8
2021.02    41.4     198.8    199.9
2021.02    39.9     77.9     148.7
2021.02    35.8     96.6     66.4
2021.02    35.8     27.5     10.6
2021.02    33.7     27.5     10.1