Constrained Reinforcement Learning on Grid
[Chart: Episodic Reward over time. Highlighted point: PPO-L, 276.3, Dec 11, 2025. Selectable metrics: Episodic Reward, Episodic Cost.]
Evaluation Results
Method   Settings                    Date      Episodic Reward   Episodic Cost
PPO-L    Cost Threshold=75, Num...   2025.12   276.3             71.8
e-COP    Cost Threshold=75, Num...   2025.12   258.1             71.3
IPO      Cost Threshold=75, Num...   2025.12   229.4             74.2
PCPO     Cost Threshold=75, Num...   2025.12   226.5             72.6
FOCOPS   Cost Threshold=75, Num...   2025.12   215.4             76.6
P3O      Cost Threshold=75, Num...   2025.12   201.5             79.3
APPO     Cost Threshold=75, Num...   2025.12   184.4             79.5
CPO      Cost Threshold=75, Num...   2025.12   178.1             69.3
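The results above are listed by Episodic Reward, but each row also carries a cost threshold of 75, so it can help to read reward and cost together. The following is a minimal sketch that splits the methods into those at or below the threshold and those above it. It assumes the reported Episodic Cost is directly comparable to the threshold, which the page does not state explicitly. The names `RESULTS`, `COST_THRESHOLD`, and `split_by_feasibility` are hypothetical and exist only for this illustration.

```python
# Rows copied from the Evaluation Results listing:
# (method, episodic reward, episodic cost).
RESULTS = [
    ("PPO-L", 276.3, 71.8),
    ("e-COP", 258.1, 71.3),
    ("IPO", 229.4, 74.2),
    ("PCPO", 226.5, 72.6),
    ("FOCOPS", 215.4, 76.6),
    ("P3O", 201.5, 79.3),
    ("APPO", 184.4, 79.5),
    ("CPO", 178.1, 69.3),
]

# Taken from the "Cost Threshold=75" setting shown on every row.
# Assumption: episodic cost is measured on the same scale as this threshold.
COST_THRESHOLD = 75.0


def split_by_feasibility(results, threshold):
    """Return (feasible, infeasible) lists of result rows.

    Feasible rows have cost <= threshold and are sorted by reward, highest
    first. Infeasible rows are sorted by cost, smallest violation first.
    """
    feasible = [row for row in results if row[2] <= threshold]
    infeasible = [row for row in results if row[2] > threshold]
    feasible.sort(key=lambda row: row[1], reverse=True)
    infeasible.sort(key=lambda row: row[2])
    return feasible, infeasible


feasible, infeasible = split_by_feasibility(RESULTS, COST_THRESHOLD)

print("Within cost threshold (by reward):")
for method, reward, cost in feasible:
    print(f"  {method}: reward={reward}, cost={cost}")

print("Above cost threshold (by cost):")
for method, reward, cost in infeasible:
    excess = round(cost - COST_THRESHOLD, 1)
    print(f"  {method}: reward={reward}, cost={cost} (+{excess})")
```

Under that assumption, FOCOPS, P3O, and APPO report costs above 75, while the remaining five methods stay within it.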