Reinforcement Learning on Pendulum v1
Best reward: -58.557 (SAC-AdaGamma, May 7, 2026)
Updated 26d ago
Evaluation Results
Method        Setting               Date     Reward
SAC-AdaGamma  Adaptive-gamma=True   2026.05  -58.557
SAC           Adaptive-gamma=False  2026.05  -64.832
PPO-AdaGamma  Adaptive-gamma=True   2026.05  -198.12
PPO           Adaptive-gamma=False  2026.05  -212.27
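To compare each adaptive-gamma variant with its baseline, the reward gaps can be computed directly from the four listed values. The sketch below is only an illustration: the names `results` and `reward_gain` are hypothetical, the numbers are copied from the results above, and it assumes the usual Pendulum convention that a reward closer to zero is better.

```python
# Rewards copied from the evaluation results (higher, i.e. closer to 0, is better).
# The dictionary layout and function name are illustrative, not part of the benchmark.
results = {
    "SAC": {"adaptive": -58.557, "baseline": -64.832},
    "PPO": {"adaptive": -198.12, "baseline": -212.27},
}


def reward_gain(table):
    """Return adaptive-minus-baseline reward for each algorithm family."""
    return {name: row["adaptive"] - row["baseline"] for name, row in table.items()}


gains = reward_gain(results)
for name, gain in gains.items():
    # A positive gain means the adaptive-gamma variant scored higher.
    print(f"{name}: {gain:+.3f}")
```

The gaps are 6.275 for SAC and 14.15 for PPO, both in favor of the adaptive-gamma variant. The page reports no seeds or variance, so these differences alone do not show whether the improvement is statistically meaningful.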