Offline Reinforcement Learning on D4RL Maze maze2d-umaze v2 (test)

13,330 Normalized Score

A2PO

Updated 5mo ago

Evaluation Results

Method	Links	Normalized Score
A2PO 2024.03		13,330
AWAC 2024.03		9,450
LAPO 2024.03		7,800
Diffusion-QL 2024.03		6,670
EQL 2024.03		5,650
IQL 2024.03		5,620
BCQ 2024.03		2,480
TD3+BC 2024.03		2,420
CQL+AW 2024.03		1,960
CQL 2024.03		570
BC 2024.03		50
MOPO 2024.03		-1,540