Constrained Reinforcement Learning on Bottleneck
[Chart: Episodic Reward. Highlighted point: CPO, 388.1, Dec 11, 2025. Metrics: Episodic Reward, Episodic Cost.]
Updated 4d ago
Evaluation Results

| Method | Settings | Date | Episodic Reward | Episodic Cost |
|--------|----------|------|-----------------|---------------|
| CPO | Cost Threshold=50, Num... | 2025.12 | 388.1 | 54.3 |
| e-COP | Cost Threshold=50, Num... | 2025.12 | 345.1 | 49.7 |
| PPO-L | Cost Threshold=50, Num... | 2025.12 | 298.3 | 41.4 |
| P3O | Cost Threshold=50, Num... | 2025.12 | 291.1 | 45.3 |
| IPO | Cost Threshold=50, Num... | 2025.12 | 279.3 | 48.2 |
| PCPO | Cost Threshold=50, Num... | 2025.12 | 264.2 | 49.8 |
| FOCOPS | Cost Threshold=50, Num... | 2025.12 | 251.3 | 46.6 |
| APPO | Cost Threshold=50, Num... | 2025.12 | 220.1 | 47.4 |
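
The results are listed by Episodic Reward alone, but every entry also reports a cost against Cost Threshold=50. As a rough reading aid, the sketch below separates entries whose reported Episodic Cost is at or below that threshold from those above it, then ranks the former by reward. It assumes that "cost <= threshold" is the right feasibility test and that the single reported cost can be compared to the threshold directly; neither is stated on this page. The names `RESULTS`, `COST_THRESHOLD`, and `split_by_feasibility` are made up for this illustration.

```python
# Rows copied from the Evaluation Results listing:
# (method, episodic reward, episodic cost).
RESULTS = [
    ("CPO", 388.1, 54.3),
    ("e-COP", 345.1, 49.7),
    ("PPO-L", 298.3, 41.4),
    ("P3O", 291.1, 45.3),
    ("IPO", 279.3, 48.2),
    ("PCPO", 264.2, 49.8),
    ("FOCOPS", 251.3, 46.6),
    ("APPO", 220.1, 47.4),
]

# Taken from the "Cost Threshold=50" setting shown for each entry.
COST_THRESHOLD = 50.0


def split_by_feasibility(results, threshold):
    """Split rows into (feasible, infeasible) by reported cost.

    Assumption: a row counts as feasible when cost <= threshold.
    Feasible rows are returned sorted by reward, highest first.
    """
    feasible = [row for row in results if row[2] <= threshold]
    infeasible = [row for row in results if row[2] > threshold]
    feasible.sort(key=lambda row: row[1], reverse=True)
    return feasible, infeasible


feasible, infeasible = split_by_feasibility(RESULTS, COST_THRESHOLD)

print("Within threshold, ranked by reward:")
for method, reward, cost in feasible:
    print(f"  {method:7s} reward={reward:6.1f} cost={cost:5.1f}")

print("Above threshold:")
for method, reward, cost in infeasible:
    print(f"  {method:7s} reward={reward:6.1f} cost={cost:5.1f}")
```

Under these assumptions, CPO has the highest reward but is the only entry whose reported cost (54.3) exceeds 50, while e-COP has the highest reward among the entries at or below the threshold.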