
Offline Reinforcement Learning on hopper medium

3,729 Normalized Score

QDFM

[Chart: normalized score over time; x-axis from Oct 10, 2023 to May 27, 2026]
Updated 5d ago

Evaluation Results

Normalized Score   Date (YYYY.MM)
3,729              2026.02
3,618              2026.02
712                2026.02
154                2026.02
113                2026.02
111                2026.02
102.1              2026.05
100.6              2026.02
98.4               2023.10
98.1               2026.05
97.8               2026.02
97.2               2023.10
97.1
96.9               2026.05
96.9               2026.05
96.4               2026.05
94.9               2023.10
94.1               2023.10
92.5               2026.05
87.2               2026.05
86.1               2026.02
85.6               2023.10
82.9               2026.02
74.6               2026.02
74.3               2023.10
71.43              2026.02
71.08              2026.02
67.6               2026.05
66.68              2026.02
66.5               2024.02
66.3               2023.10
66.2               2026.02
63.8               2026.05
60.88              2026.02
60.3               2024.02
59.39              2026.02
59.3               2026.05
58                 2024.02
57.3               2024.02
56.09              2026.02
55.9               2026.02
55.26              2026.02
54.5               2023.10
53                 2026.02
52.1               2023.10
51.62              2026.02
50.7               2024.02
43.4               2026.02
40.61              2025.06
35.2               2025.12
32.99              2025.06
32.4               2026.02
31.6               2025.12
25.5               2025.12
24.2               2025.12
23.3               2024.02
20.6               2025.12
20.3               2025.12
19.3               2025.12
17.3               2026.02
15.3               2026.02
15.2               2026.02
13.05              2025.06
12.96              2025.06
12.69              2025.06
12.67              2025.06
12.4               2026.02
11.2               2026.02