Continuous Control on MountainCarContinuous v0
[Chart: Return by date. Best result: 99.39 (pi_ana), May 21, 2026. Selectable metrics: Return, Reward Component r, Min Return, Max Return, Time to Completion (t*), Policy Difference L2 Norm.]
Updated 12d ago
Evaluation Results
| Method | Details | Date | Return | Reward Component r | Min Return | Max Return | Time to Completion (t*) | Policy Difference L2 Norm |
|---|---|---|---|---|---|---|---|---|
| pi_ana | Policy Type=Analytical... | 2026.05 | 99.39 | - | 99.15 | 99.52 | 769 | - |
| CH-3-ARS | Policy Type=Chebyshev... | 2026.05 | 98.74 | 0.65 | 98.95 | 99.11 | 471 | 0.152 |
| CH-3-REI | Policy Type=Chebyshev... | 2026.05 | 98.62 | 0.77 | 98.31 | 98.89 | 396 | 0.068 |
| CH-3-PPO | Policy Type=Chebyshev... | 2026.05 | 98.1 | 1.29 | 97.61 | 98.42 | 469 | 0.087 |
| ARS | Policy Type=Neural Pol... | 2026.05 | 96.67 | 2.72 | 92.51 | 97.42 | 239 | 0.211 |
| SAC | Policy Type=Neural Pol... | 2026.05 | 94.61 | 4.78 | 89.7 | 95.77 | 106 | 0.317 |
| PPO | Policy Type=Neural Pol... | 2026.05 | 93.91 | 5.48 | 90.86 | 95.23 | 298 | 0.273 |
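The page does not define Reward Component r. In the listed numbers, though, it appears to coincide with the gap between the pi_ana return (99.39) and each method's return. The sketch below checks that reading against the values copied from the results above. The interpretation is an assumption, and the names `results` and `gap_to_baseline` are hypothetical, introduced only for illustration.

```python
import math

# (Return, Reward Component r) copied from the evaluation results above.
# pi_ana has no listed r value, so it is stored as None.
results = {
    "pi_ana":   (99.39, None),
    "CH-3-ARS": (98.74, 0.65),
    "CH-3-REI": (98.62, 0.77),
    "CH-3-PPO": (98.10, 1.29),
    "ARS":      (96.67, 2.72),
    "SAC":      (94.61, 4.78),
    "PPO":      (93.91, 5.48),
}

baseline_return = results["pi_ana"][0]


def gap_to_baseline(method_return, baseline=baseline_return):
    """Return the baseline return minus a method's return (assumed meaning of r)."""
    return baseline - method_return


# Compare the computed gap with the listed r for every method that has one.
matches = {}
for name, (ret, listed_r) in results.items():
    if listed_r is None:
        continue
    gap = gap_to_baseline(ret)
    # Tolerance of half a unit in the last listed decimal place.
    matches[name] = math.isclose(gap, listed_r, abs_tol=0.005)
    print(f"{name:9s} gap={gap:.2f} listed r={listed_r:.2f} match={matches[name]}")
```

If every row matches, the column can be read as a shortfall relative to the analytical policy, so smaller values are better. This consistency check does not confirm how the benchmark authors actually define r.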