Action-Value coverage estimation on RiverSwim (mostly-right target policy, T=50)

Q-Value Estimate (s=1, a=0): 0.523

Model-based bootstrap with percentile CI
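
For readers unfamiliar with the method named above, here is a minimal sketch of one common way to build a model-based bootstrap with a percentile confidence interval for a tabular Q-value. It is not taken from this benchmark's implementation. The toy two-state MDP, the reward table, the multinomial (parametric) resampling scheme, and the names `q_values` and `bootstrap_q_ci` are all assumptions made for illustration, and the toy MDP is not RiverSwim.

```python
import numpy as np

def q_values(P, R, policy, horizon, gamma=1.0):
    """Finite-horizon evaluation of a deterministic policy.

    P: (S, A, S) transition probabilities, R: (S, A) rewards,
    policy: (S,) action index per state. Returns Q of shape (S, A)
    with `horizon` steps remaining.
    """
    n_states = R.shape[0]
    V = np.zeros(n_states)
    Q = np.zeros_like(R, dtype=float)
    for _ in range(horizon):
        Q = R + gamma * (P @ V)                # (S, A, S) @ (S,) -> (S, A)
        V = Q[np.arange(n_states), policy]     # follow the target policy
    return Q

def bootstrap_q_ci(counts, R, policy, horizon, s, a,
                   n_boot=200, alpha=0.05, seed=0):
    """Percentile CI for Q(s, a) by resampling from the fitted model.

    counts[s, a, s'] are observed transition counts; every (s, a) pair
    is assumed to have been visited at least once.
    """
    rng = np.random.default_rng(seed)
    counts = np.asarray(counts, dtype=np.int64)
    n_sa = counts.sum(axis=-1)                 # visits per (s, a)
    P_hat = counts / n_sa[..., None]           # maximum-likelihood model
    point = q_values(P_hat, R, policy, horizon)[s, a]

    draws = np.empty(n_boot)
    for b in range(n_boot):
        resampled = np.empty(counts.shape, dtype=float)
        for i in range(counts.shape[0]):
            for j in range(counts.shape[1]):
                # Redraw the same number of transitions from the fitted model.
                resampled[i, j] = rng.multinomial(n_sa[i, j], P_hat[i, j])
        P_b = resampled / n_sa[..., None]
        draws[b] = q_values(P_b, R, policy, horizon)[s, a]

    lo, hi = np.quantile(draws, [alpha / 2, 1 - alpha / 2])
    return point, lo, hi

# Hypothetical toy data: 2 states, 2 actions (not the RiverSwim environment).
counts = np.array([[[8, 2], [3, 7]],
                   [[6, 4], [1, 9]]])
R = np.array([[0.0, 0.0],
              [0.0, 1.0]])
policy = np.array([1, 1])
point, lo, hi = bootstrap_q_ci(counts, R, policy, horizon=5, s=0, a=1)
```

Coverage, in this setting, would be the fraction of repeated experiments in which an interval like `[lo, hi]` contains the true Q-value; the sketch computes a single interval only.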

May 12, 2026
Updated 21d ago

Evaluation Results

Method Links
2026.05  0.523  0.526  0.518
2026.05  0.507  0.508  0.483
2026.05  0.503  0.499  0.462
2026.05  0.495  0.497  0.525
2026.05  0.487  0.488  0.491
2026.05  0.48   0.485  0.477
2026.05  0.474  0.473  0.484
2026.05  0.47   0.465  0.447
2026.05  0.465  0.453  0.411
2026.05  0.444  0.449  0.452
2026.05  0.434  0.445  0.036
2026.05  0.427  0.406  0.422
2026.05  0.332  0.324  0.208
2026.05  0.304  0.31   0.005
2026.05  0.271  0.224  0.232
2026.05  0.24   0.237  0.077
2026.05  0.229  0.205  0.309
2026.05  0.195  0.188  0.119
2026.05  0      0      0
2026.05  0      0      0