
Reinforcement Learning on MountainCarContinuous v0 (Average Agent Reward)

98.75 Average Agent Reward

R2PO

[Chart: Average Agent Reward over time, Jun 20, 2024 to May 8, 2026]
Updated 22d ago

Evaluation Results

Method    Links
2026.05    98.75      -
2026.05    98.75      -
2026.05    98.7       -
2026.03    97         -
2026.03    96         -
2026.03    95         -
2026.05    94.81      -
2026.05    94.6       -
2026.05    94.47      -
2026.03    94         -
2024.06    93.63      0.21
2024.06    93.63      0.21
2024.06    93.63      0.21
2024.06    93.63      0.21
2024.06    93.63      0.21
2024.06    93.63      0.21
2024.06    93.63      0.21
2024.06    93.63      0.21
2024.06    93.62      0.35
2026.01    93.52      -
2024.06    93.39      0.51
2024.06    93.18      0.41
2024.06    93.15      0.81
2024.06    93.03      0.37
2024.06    92.82      0.89
2024.06    92.72      0.75
2024.06    92.67      0.85
2024.06    91.98      1.47
2026.05    89.16      -
2024.06    89.1       4.78
2026.05    87.21      -
2026.05    86.84      -
2026.05    86.65      -
2026.05    84.72      -
2026.05    82.33      -
2026.05    81.61      -
2024.06    81.32      1.63
2024.06    81.1       2.02
2024.06    79.34      1.02
2026.05    78.16      -
2024.06    76.31      1.74
2024.06    76.23      3.03
2026.05    75.37      -
2024.06    74.2       7.2
2024.06    71.35      13.15
2024.06    71.25      12.27
2024.06    69.58      10.57
2024.06    66.36      12.27
2024.06    66.19      20.54
2024.06    65.01      9.69
2024.06    63.12      19.53
2024.06    62.42      25.91
2024.06    62.325.59
2024.06    62         25.12
2024.06    61.94      21.83
2024.06    61.64      17.19
2024.06    61.52      28.84
2024.06    61.09      16.84
2024.06    59.627.32
2024.06    57.13      34.55
2026.05    23.45      -
2026.05    17.9       -
2026.03    4          -
2026.03    -10        -
2026.01    -311.75    -