Safe Reinforcement Learning on Hopper-Velocity

1,554.56 Reward

RCPO

[Chart: Reward by date (Jan 26, 2026)]
Updated 4d ago

Evaluation Results

Method    Links
2026.01   1,554.56   37.53   979.3
2026.01   1,470.61   10.67   1,000
2026.01   1,326.84   26.9    854.4
2026.01   1,322.83   10.43   910.2
2026.01   1,248.94   9.27    818.1
2026.01   1,216.01   27.57   797.33
2026.01   1,177.19   18.03   797.57