
Policy Learning on CASdatasets Counterfactual (test)

-9.38 V(pi)

Optimal

[Chart: V(pi) over time. Y-axis ticks: -10.4408, -10.1654, -9.89, -9.6146. X-axis: Jan 27, 2026]
Updated 1mo ago

Evaluation Results

Method    Links
2026.01    -9.38    0.38    1.1
2026.01    -10.18    0.02    0.12
2026.01    -10.23    0.04    0.11
2026.01    -10.40    0.49
2026.01    -10.40    0.49