
Safe Reinforcement Learning on AntCircle (test)

Violation Rate: 0

CPPOADRC

Updated 4mo ago

Evaluation Results

Method	Date	Violation Rate	(unlabeled)	(unlabeled)
CPPOADRC	2026.01	0	0	3.23
TD3ADRC	2026.01	0.02	0	2.49
DDPGPID	2026.01	0.04	0	1.59
DDPGLag	2026.01	0.07	0	1.93
TRPOADRC	2026.01	0.14	0	5.74
DDPGADRC	2026.01	0.15	0	2.15
CPPOPID	2026.01	10.82	0.62	12.95
CPPOLag	2026.01	13.98	6.59	17.3
TRPOPID	2026.01	17.73	1.06	16.14
TRPOLag	2026.01	20.8	1.64	16.58
TD3PID	2026.01	21.32	3.61	13.74
TD3Lag	2026.01	31.26	1.76	15.02