Offline-to-Online Reinforcement Learning on door-cloned v1
[Chart: Average Online Return; leading method DUAL at 15.26; May 29, 2026]
Evaluation Results

Method    Method details              Date      Average Online Return
DUAL      Critic framework=IQL,...    2026.05   15.26
Diff-QL   Critic framework=IQL,...    2026.05   11.55
EDIS      Critic framework=IQL,...    2026.05   10.46
Base      Critic framework=IQL,...    2026.05   9.79
DUAL      Critic framework=Cal-Q...   2026.05   -0.28
Diff-QL   Critic framework=Cal-Q...   2026.05   -0.31
EDIS      Critic framework=Cal-Q...   2026.05   -0.32
Base      Critic framework=Cal-Q...   2026.05   -0.33