Reinforcement Learning on Acrobot v1

Mean Return: 89.37

π-PRL

[Chart: Mean Return over time, Dec 18, 2025 to May 18, 2026]
Updated 15d ago

Evaluation Results

Date      Mean Return   (unlabeled)   (unlabeled)   (unlabeled)   Method   Links
2026.05   89.37         -             -             -
2026.05   86.63         -             -             -
2026.05   80.31         -             -             -
2026.05   79.93         -             -             -
2026.05   78.98         -             -             -
2026.05   67.83         -             -             -
2026.05   62.57         -             -             -
2026.05   -82.61        -             -             -
2026.05   -82.91        -             -             -
2026.05   -95.34        -             -             -
2026.05   -107.18       -             -             0.46
2026.05   -115.186      -             -             -
2026.05   -125.71       -             -             0.54
2025.12   -140.16       88.52         -             -
2026.02   -147.16       71.06         -             -
2026.05   -148.58       -             -             0.64
2025.12   -149.59       85.72         -             -
2026.05   -150.1        -             -             0.64
2026.05   -154.09       -             -             0.66
2025.12   -154.82       85.56         -             -
2026.02   -156.9        82.4          -             -
2026.05   -162.14       -             -             0.69
2026.02   -164.69       84.03         -             -
2025.12   -166.32       97.39         -             -
-         -172.6        106.6         -             -
2026.05   -196.63       -             -             0.84
2026.05   -218.71       -             -             0.94
2026.05   -233.44       -             -             1
2026.05   -245.62       -             -             1.05
2025.12   -264.58       130.77        -             -
2026.05   -273.76       -             -             1.17
2026.05   -304.7        -             -             1.31
2026.05   -309.2        -             -             1.32
2026.02   -498          19.9          -             -
2026.05   -498.16       -             -             2.13
2026.05   -498.54       -             -             2.14
-         -498.94       -             -             2.14
2026.05   -499.02       -             -             2.14
2025.12   -5,000        -             0             -
2025.12   -7,518        -             85            -
2025.12   -7,715        -             90            -
2025.12   -8,013        -             95            -