Online Reinforcement Learning on FingerTurnHard DMControl (final)

884.59 Normalized Return

GoRL(Diff)

Updated 5mo ago

Evaluation Results

Method	Links	Normalized Return
GoRL(Diff) 2025.12		884.59
GoRL(FM) 2025.12		860.83
FPO 2025.12		752.08
PPO 2025.12		738.7
DPPO 2025.12		633.84