Safe Reinforcement Learning on Circle (offline)

Normalized Reward: 1.27 (Task-Only)

Updated 12d ago

Evaluation Results

Method	Normalized Reward	Links
Task-Only 2026.05	1.27	1
RC 2026.05	1.09	0
SOPL 2026.05	1.04	0
Oracle 2026.05	1	0
Safe-CPL 2026.05	0.91	0
Safe-VPL 2026.05	0.9	0