Safe Reinforcement Learning on SafetyPointGoal2-v0

Mean Reward: 15.58 (TRPO)

Updated 12d ago

Evaluation Results

Method	Links	Mean Reward
TRPO 2026.04		15.58	10.31	164.14	88.43
PPO 2026.04		13.26	14.05	167.46	87.06
TRPO-LAG 2026.04		2.37	8.46	89.04	187.67
PPO-LAG 2026.04		2.24	5.1	54.1	64.5
PPO-FAB 2026.04		-1.12	1.11	24.32	37.84