Constrained Offline Reinforcement Learning on DSRL Walker2dVelocity

0.8 Normalized Return

MPDiffuser

Updated 5mo ago

Evaluation Results

Method	Links	Normalized Return	Normalized Cost
MPDiffuser 2025.12		0.8	0.27
BC-All 2025.12		0.79	3.88
BC-Safe 2025.12		0.79	0.04
BCQ-Lag 2025.12		0.79	0.17
CDT 2025.12		0.78	0.06
COptiDICE 2025.12		0.12	0.74
CPQ 2025.12		0.04	0.21