
Policy learning from action-inclusive feedback on OpenML (K ≥ 3, N ≥ 70,000)

58.41 Policy Accuracy

CB Policy

[Chart: Policy Accuracy; Jun 16, 2022]
Updated 1mo ago

Evaluation Results

Method Links
2022.06
58.41--
2022.06
50.11-0.79
2022.06
11.91-0.22
-22.57-