Reinforcement Learning on MountainCar (Pure)

-44.6 Avg Reward (gamma=0.01)

CQL

Updated 5mo ago

Evaluation Results

Method	Links	Avg Reward (gamma=0.01)	[column label not preserved]	[column label not preserved]
CQL 2021.02		-44.6	-500	-499.3
MOREL 2021.02		-44.6	-500	-500
CQL 2021.02		-44.7	-500	-500
CQL 2021.02		-44.7	-500	-490.1
MOREL 2021.02		-44.7	-500	-492.1
BCQ 2021.02		-44.79	-380.7	-500
MOREL 2021.02		-46	-500	-500
BCQ 2021.02		-50.01	-373.6	-352
True env+MPC 2021.02		-53.95	-182.9	-197.5
True env+MPC 2021.02		-53.95	-182.9	-197.5
True env+MPC 2021.02		-53.95	-182.9	-197.5
PerSim 2021.02		-54.2	-191.2	-199.7
PerSim 2021.02		-54.6	-189.7	-200.3
Vanilla CaDM 2021.02		-55.23	-481.7	-496.2
Vanilla CaDM 2021.02		-56.73	-463.2	-478.9
PerSim 2021.02		-56.8	-189.4	-210.6
MOREL 2021.02		-61.3	-373.2	-428.5
CQL 2021.02		-61.4	-366.9	-429.1
BCQ 2021.02		-67.6	-267.8	-295.1
BCQ 2021.02		-71.21	-286.6	-328.3
PE-TS CaDM 2021.02		-74.23	-492.3	-500
CQL 2021.02		-79.4	-357.9	-407.6
MOREL 2021.02		-83.5	-357	-407.4
BCQ 2021.02		-94.87	-358.7	-486.5
PE-TS CaDM 2021.02		-102.3	-500	-500
Vanilla CaDM 2021.02		-106.3	-432.3	-471.8
PE-TS CaDM 2021.02		-107.6	-500	-500
CQL 2021.02		-176.1	-316.4	-362.9
BCQ 2021.02		-364.5	-260.6	-204.5
MOREL 2021.02		-373	-500	-500