Offline-to-online Reinforcement Learning on D4RL Hopper expert discretized
[Chart: Online Normalised Score and Offline Normalised Score over time. Best Online Normalised Score: 47.1 (DRIFT). Date shown: May 12, 2026.]
Updated 21d ago
Evaluation Results
| Method | Configuration | Date | Online Normalised Score | Offline Normalised Score |
|---|---|---|---|---|
| DRIFT | discretisation=k-means... | 2026.05 | 47.1 | 0.4 |
| PEX | discretisation=k-means... | 2026.05 | 39.5 | 0.1 |
| CQL | discretisation=k-means... | 2026.05 | 29.3 | 8.8 |
| Cal-QL | discretisation=k-means... | 2026.05 | 25.7 | 8.8 |
| DQN | discretisation=k-means... | 2026.05 | 21.7 | 0.3 |
| IQL | discretisation=k-means... | 2026.05 | 17.2 | 0.1 |
| AWAC | discretisation=k-means... | 2026.05 | 11.2 | 11.5 |
| PPO | discretisation=k-means... | 2026.05 | 3.7 | - |
| SPA | discretisation=k-means... | 2026.05 | 0.3 | 0.3 |