Reinforcement Learning on Pusher
[Figure: Average Returns over time, Jan 26, 2026 to Mar 30, 2026. Top entry: DF-CWP-CP, 39.88.]
Updated 18d ago
Evaluation Results
Method     Details                    Date     Average Returns
DF-CWP-CP  Number of training see...  2026.03  39.88
A2C        Number of training see...  2026.03  32.41
CG-FPD     Number of training see...  2026.03  27.23
PPO        Number of training see...  2026.03  25.5
SAC        Number of training see...  2026.03  25.5
SMAC       batch size=1000, seeds=5   2026.01  -408.2
AC-SGD     batch size=1000, seeds=5   2026.01  -433.6
AC-CG      batch size=1000, seeds=5   2026.01  -441
AC-Adam    batch size=1000, seeds=5   2026.01  -568.3
AC-KFAC    batch size=1000, seeds=5   2026.01  -1,086.3