
Offline Reinforcement Learning on D4RL Adroit pen (human)

128.3 Normalized Return

QQL

[Chart: Normalized Return over time, Jan 30, 2023 to May 27, 2026]
Updated 6d ago

Evaluation Results

Date (YYYY.MM)    Normalized Return
2025.11           128.3
2025.11           122.1
2025.11           106.2
2025.11           105.3
2026.02           103.5
2025.12           103.2
2025.11           99.7
2024.02           83.9
2026.02           83.9
2026.02           81.8
2026.02           81.5
2025.12           78.5
2023.01           76.3
2026.02           76
2025.12           74.9
2024.02           74.2
2026.02           73.9
2025.12           72.8
2024.02           71.5
2025.12           71
2026.02           69
2026.02           68.5
2026.02           64
2024.02           62.9
2026.05           62.7
2025.11           58.9
2026.05           56.6
2023.01           53.8
2026.05           53.5
2023.01           53
2026.02           53
2025.12           52.6
2025.12           52.1
2026.05           49.8
2026.05           49.8
2026.05           47.5
2026.05           44.3
2023.01           44.2
2024.02           37.5
2026.02           37.5
2023.01           31.6
2025.12           30.1
2025.12           20.8
2026.05           14.9
2025.12           10.7
2025.11           10
2026.02           5.6
2026.05           2.2
2026.05           2.2
2026.05           1
2026.05           0.1
2026.05           0
2026.05           -3