Reinforcement Learning on Safe CartPole
[Chart: Episode Reward over time. Highlighted result: RPO-DDPG, Episode Reward 200, Oct 14, 2023. Selectable metrics: Episode Reward, Max Inst. Ineq Violation, Max Inst. Eq Violation, Max Episode Ineq Violation, Max Episode Eq Violation.]
Updated 4d ago
Evaluation Results
Method        Date     Episode Reward  Max Inst. Ineq Violation  Max Inst. Eq Violation  Max Episode Ineq Violation  Max Episode Eq Violation  Links
RPO-DDPG      2023.10  200             0                         0                       0                           0
RPO-SAC       2023.10  200             0                         0                       0                           0
CUP           2023.10  63.9            0                         0.5344                  0.7904                      0
Safety Layer  2023.10  53.8            0.0086                    1.7099                  0                           13.6603
SAC-L         2023.10  40.5            0                         0.0487                  0                           0.1038                    type=abridged version
CPO           2023.10  20.1            0                         0.0389                  0                           0.1096
DDPG-L        2023.10  15.5            0                         0.0408                  0                           0.118                     type=abridged version