Online Reinforcement Learning on WalkerWalk DMControl (final)
[Chart: Normalized Return over time. Top result: GoRL(FM), 919.61 (Dec 2, 2025).]
Updated 3mo ago
Evaluation Results
Method      Configuration               Date     Normalized Return
GoRL(FM)    Seeds=5, Decoder=Flow-...   2025.12  919.61
GoRL(Diff)  Seeds=5, Decoder=Diffu...   2025.12  908.96
PPO         Seeds=5                     2025.12  825.65
DPPO        Seeds=5                     2025.12  345.59
FPO         Seeds=5                     2025.12  29