Online Reinforcement Learning on CheetahRun DMControl (final)

902.24 Normalized Return

GoRL(Diff)

Updated 5mo ago

Evaluation Results

Method	Date	Normalized Return
GoRL(Diff)	2025.12	902.24
GoRL(FM)	2025.12	883.4
PPO	2025.12	724.83
FPO	2025.12	599.15
DPPO	2025.12	559.79