Interaction-Grounded Learning with Action-inclusive Feedback

About

Consider the problem setting of Interaction-Grounded Learning (IGL), in which a learner's goal is to interact optimally with the environment with no explicit reward to ground its policies. The agent observes a context vector, takes an action, and receives a feedback vector, using this information to effectively optimize a policy with respect to a latent reward function. Previously analyzed approaches fail when the feedback vector contains the action, which significantly limits IGL's success in many potential scenarios such as brain-computer interface (BCI) or human-computer interface (HCI) applications. We address this by creating an algorithm and an analysis that allow IGL to work even when the feedback vector contains the action, encoded in any fashion. We provide theoretical guarantees and large-scale experiments based on supervised datasets to demonstrate the effectiveness of the new approach.

Tengyang Xie, Akanksha Saran, Dylan J. Foster, Lekan Molu, Ida Momennejad, Nan Jiang, Paul Mineiro, John Langford • 2022
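
To make the setting concrete, below is a minimal Python sketch of one possible interaction loop with action-inclusive feedback. The environment class, the one-hot action encoding, and the 5% flip noise are illustrative assumptions for this page, not the paper's actual construction; the essential point from the abstract is preserved: the learner only ever sees the context, its action, and the feedback vector, while the reward stays latent.

```python
# Minimal sketch of the IGL interaction protocol with action-inclusive
# feedback. ToyIGLEnvironment, the feedback construction, and the uniform
# exploration policy are hypothetical, for illustration only.
import numpy as np

rng = np.random.default_rng(0)

NUM_ACTIONS = 3
CONTEXT_DIM = 5


class ToyIGLEnvironment:
    """Hypothetical IGL environment: the latent reward is 1 when the action
    matches a hidden context-dependent label, and the feedback vector is a
    noisy reward signal concatenated with a one-hot copy of the action."""

    def __init__(self):
        # Hidden linear rule mapping contexts to the rewarding action.
        self.weights = rng.normal(size=(CONTEXT_DIM, NUM_ACTIONS))

    def context(self):
        return rng.normal(size=CONTEXT_DIM)

    def feedback(self, x, a):
        # Latent reward -- never revealed to the learner directly.
        r = int(np.argmax(x @ self.weights) == a)
        # Reward-dependent component, corrupted with 5% flip noise
        # (an assumed noise level, echoing the benchmark table below).
        signal = r if rng.random() > 0.05 else 1 - r
        # Action-inclusive component: the feedback literally encodes the
        # action taken, the case where prior IGL analyses break down.
        action_code = np.eye(NUM_ACTIONS)[a]
        return np.concatenate(([signal], action_code))


env = ToyIGLEnvironment()
for _ in range(3):
    x = env.context()                    # observe a context vector
    a = int(rng.integers(NUM_ACTIONS))   # explore uniformly at random
    y = env.feedback(x, a)               # feedback vector contains the action
    print(f"action={a}, feedback={y}")
```

This toy construction makes the difficulty named in the abstract visible: since the feedback vector embeds the action, a naive decoder can "explain" the feedback from the action alone, so a learning rule must remain sound under any such encoding.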

Related benchmarks

Task | Dataset | Metric | Result | Rank
Policy learning from action-inclusive feedback | OpenML (K ≥ 3) | Policy Accuracy | 35.74 | 3
Policy learning from action-inclusive feedback | OpenML (K ≥ 3, N ≥ 70,000) | Policy Accuracy | 50.11 | 3
Interaction-Grounded Learning | Simulated BCI Action-Inclusive Feedback, 1% noise (test) | Policy Accuracy | 89.1 | 2
Interaction-Grounded Learning | Simulated BCI Action-Inclusive Feedback, 5% noise (test) | Policy Accuracy | 76.6 | 2
Interaction-Grounded Learning | Simulated BCI Action-Inclusive Feedback, 10% noise (test) | Policy Accuracy | 64.25 | 2
