Offline-to-Online Reinforcement Learning on Adroit Average
Chart: Average Online Return. Leading method: DUAL, 46.71 (May 29, 2026).
Evaluation Results
Method    Configuration               Date      Average Online Return
DUAL      Critic framework=IQL,...    2026.05   46.71
Diff-QL   Critic framework=IQL,...    2026.05   37.5125
EDIS      Critic framework=IQL,...    2026.05   33.9775
Base      Critic framework=IQL,...    2026.05   33.01
DUAL      Critic framework=Cal-Q...   2026.05   -0.2025
Diff-QL   Critic framework=Cal-Q...   2026.05   -0.545
EDIS      Critic framework=Cal-Q...   2026.05   -0.61
Base      Critic framework=Cal-Q...   2026.05   -0.76