Offline Reinforcement Learning on D4RL Walker-medium-expert v2

Best normalized return: 113 (Onestep)

Updated 1mo ago

Evaluation Results

Method	Links	Normalized Return
Onestep 2022.02		113
SPOT 2022.02		112
VDT 2026.01		110.4
TD3+BC 2022.02		110.1
TD3+BC 2026.01		110.1
LSDT 2026.01		109.8
IQL 2022.02		109.6
IQL 2026.01		109.6
DDT 2026.01		109.5
CQL 2022.02		108.8
DT 2026.01		108.6
DT 2022.02		108.1
BC 2022.02		107.5
CQL 2026.01		98.7
BRAC-v 2026.01		81.6
AWAC 2022.02		74.5