
Offline Reinforcement Learning on Maze2D large v1

220.66 Normalized Return

Inverter

[Chart: results over time, Dec 8, 2025 to May 22, 2026; y-axis ticks at -6.476, 52.492, 111.46, 170.428]
Updated 9d ago

Evaluation Results

Method | Links
2026.05
220.660.21.5914.393.71
2026.05
209.138.191.28-1.0210.9
2026.05
204.761.191.8-1.4415.4
2026.05
123.0762.2214.32173.6839.3
2026.05
97.125.411.67-1.3414.3
2026.05
95.622.921.74-1.3914.9
2026.05
78.3361.770.21-1681.8
2026.05
61.723.51.69-1.3614.5
2025.12
37.7-----
2026.05
35.6628.23.57-2.8630.5
2025.12
35.6-----
2025.12
31.9-----
2025.12
30.5-----
2025.12
29.5-----
2025.12
28.3-----
2026.05
23.7536.71.84-1.4715.7
2025.12
23.4-----
2025.12
21.8-----
2025.12
20.5-----
2025.12
19.8-----
2025.12
14.3-----
2026.05
11.325.11.25-999.710.7
2025.12
10.9-----
2025.12
9.9-----
2025.12
9.9-----
2025.12
9.8-----
2025.12
7.6-----
2025.12
6.3-----
2025.12
4.9-----
2026.05
2.264.391.7-1.3614.6