Reinforcement Learning on Inverted Pendulum Swingup (Average Episode Reward)

892.9 Avg Episode Reward

TRPO

[Chart: Avg Episode Reward by date (Nov 2, 2023)]
Updated 4d ago

Evaluation Results

Date      Avg Episode Reward
2023.11   892.9
2023.11   892.25
2023.11   891.9
2023.11   891.48
2023.11   891.32
2023.11   890.36
2023.11   890.11
2023.11   890.1
2023.11   254.71
2023.11   -19.21