Offline-to-Online Reinforcement Learning on pen-cloned v1
Best Avg Online Return: 124.4 (DUAL)
[Chart: Avg Online Return by date; plotted point: May 29, 2026]
Evaluation Results
Method    Configuration               Date      Avg Online Return
DUAL      Critic framework=IQL,...    2026.05   124.4
Diff-QL   Critic framework=IQL,...    2026.05   104.45
EDIS      Critic framework=IQL,...    2026.05   97.31
Base      Critic framework=IQL,...    2026.05   94.74
DUAL      Critic framework=Cal-Q...   2026.05   -1.09
Diff-QL   Critic framework=Cal-Q...   2026.05   -1.93
EDIS      Critic framework=Cal-Q...   2026.05   -2.23
Base      Critic framework=Cal-Q...   2026.05   -2.69