Offline-to-Online Reinforcement Learning on pen-cloned v1

Best Avg Online Return: 124.4 (DUAL)

Updated 1mo ago

Evaluation Results

Method	Links	Avg Online Return
DUAL 2026.05		124.4
Diff-QL 2026.05		104.45
EDIS 2026.05		97.31
Base 2026.05		94.74

Method	Links	(metric not labeled)
DUAL 2026.05		-1.09
Diff-QL 2026.05		-1.93
EDIS 2026.05		-2.23
Base 2026.05		-2.69