Reinforcement Learning on Inverted Pendulum Swingup (Average Episode Reward)

Best result: 892.9 Avg Episode Reward (TRPO)

Updated 1mo ago

Evaluation Results

Method	Avg Episode Reward
TRPO 2023.11	892.9
TD3 2023.11	892.25
DSP 2023.11	891.9
DDPG 2023.11	891.48
SAC 2023.11	891.32
ESPL 2023.11	890.36
ACKTR 2023.11	890.11
PPO 2023.11	890.1
A2C 2023.11	254.71
Regression 2023.11	-19.21