Online Reinforcement Learning on FingerTurnHard DMControl (final)
[Chart: Normalized Return over time. Best result: GoRL(Diff), 884.59 (Dec 2, 2025).]
Updated 3mo ago
Evaluation Results
| Method | Configuration | Date | Normalized Return |
| --- | --- | --- | --- |
| GoRL(Diff) | Seeds=5, Decoder=Diffu... | 2025.12 | 884.59 |
| GoRL(FM) | Seeds=5, Decoder=Flow-... | 2025.12 | 860.83 |
| FPO | Seeds=5 | 2025.12 | 752.08 |
| PPO | Seeds=5 | 2025.12 | 738.7 |
| DPPO | Seeds=5 | 2025.12 | 633.84 |