Linear off-policy prediction on Baird environment
[Chart: Max RMSE by date. Data point shown: TD, Max RMSE 2.21, May 2, 2026. Available views: Max RMSE, Divergence Count.]
Updated 27d ago
Evaluation Results
| Method | Configuration | Date | Max RMSE | Divergence Count |
|--------|---------------|------|----------|------------------|
| TD | alpha=0.01, total runs=10 | 2026.05 | 2.21 | 10 |
| ETD | alpha=0.01, total runs=50 | 2026.05 | 2.41 | 50 |
| GTD2 | alpha=0.01, total runs=10 | 2026.05 | 5.318 | 0 |
| TETD | alpha=0.01, total runs=50 | 2026.05 | 5.44 | 50 |
| TDRC | alpha=0.01, total runs=10 | 2026.05 | 11.35 | 0 |
| TDC | alpha=0.01, total runs=10 | 2026.05 | 11.5 | 0 |
| CETD | alpha=0.01, total runs=10 | 2026.05 | 17.54 | 0 |
| RETD | alpha=0.01, total runs=10 | 2026.05 | 21.16 | 0 |
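The divergence counts above are easier to compare across methods once they are divided by each method's total runs, since ETD and TETD report 50 runs while the others report 10. The following is a minimal sketch of that arithmetic, assuming "Divergence Count" means the number of runs (out of "total runs") that were flagged as divergent; the page does not define the metric, and the names `RESULTS`, `divergence_rate`, and `summarize` are hypothetical.

```python
# Rows transcribed from the results listed above:
# (method, total runs, max RMSE, divergence count).
RESULTS = [
    ("TD", 10, 2.21, 10),
    ("ETD", 50, 2.41, 50),
    ("GTD2", 10, 5.318, 0),
    ("TETD", 50, 5.44, 50),
    ("TDRC", 10, 11.35, 0),
    ("TDC", 10, 11.5, 0),
    ("CETD", 10, 17.54, 0),
    ("RETD", 10, 21.16, 0),
]


def divergence_rate(divergence_count, total_runs):
    """Fraction of runs flagged as divergent.

    Assumption: the divergence count is measured out of total_runs.
    """
    return divergence_count / total_runs


def summarize(results):
    """Map each method name to its divergence rate."""
    return {
        method: divergence_rate(count, runs)
        for method, runs, _max_rmse, count in results
    }


rates = summarize(RESULTS)

# Methods with no divergent runs, in the order they are listed above.
non_divergent = [method for method, rate in rates.items() if rate == 0.0]

for method, rate in rates.items():
    print(f"{method:5s} divergence rate = {rate:.0%}")
```

Under this reading, TD, ETD, and TETD have a divergence rate of 100% and the remaining five methods have 0%, so Max RMSE is probably best compared within each of those two groups rather than across them.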