Safe Reinforcement Learning on AntButton (test)

Violation Rate: 0 (TD3ADRC)

Updated 4mo ago

Evaluation Results

Method	Links
TD3ADRC 2026.01		0	0	0.86
TRPOADRC 2026.01		0	0	4.15
CPPOADRC 2026.01		1	0	2.69
TRPOLag 2026.01		1	0	4.4
TRPOPID 2026.01		1	0	4.29
CPPOLag 2026.01		22	0.83	6.09
CPPOPID 2026.01		30	0.01	6.95
DDPGADRC 2026.01		193	0.12	4.87
TD3PID 2026.01		224	0.11	3.14
TD3Lag 2026.01		331	0.32	6.22
DDPGLag 2026.01		572	0.3	7.35
DDPGPID 2026.01		622	0.47	7.92