Accountability in Offline Reinforcement Learning: Explaining Decisions with a Corpus of Examples

About

Learning controllers with offline data in decision-making systems is an essential area of research due to its potential to reduce the risk of applications in real-world systems. However, in responsibility-sensitive settings such as healthcare, decision accountability is of paramount importance, yet has not been adequately addressed by the literature. This paper introduces the Accountable Offline Controller (AOC) that employs the offline dataset as the Decision Corpus and performs accountable control based on a tailored selection of examples, referred to as the Corpus Subset. AOC operates effectively in low-data scenarios, can be extended to the strictly offline imitation setting, and displays qualities of both conservation and adaptability. We assess AOC's performance in both simulated and real-world healthcare scenarios, emphasizing its capability to manage offline control tasks with high levels of performance while maintaining accountability.
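The idea of acting from a tailored selection of examples can be illustrated with a toy sketch. This is not the paper's algorithm: the Euclidean nearest-neighbor selection, the "highest return wins" rule, and the names `select_corpus_subset`, `act_from_examples`, and the `state`/`action`/`ret` fields are all assumptions made here for illustration. The point is only that the controller returns the supporting examples alongside the action, so each decision can be traced back to entries in the offline dataset.

```python
import math


def select_corpus_subset(corpus, state, k=3):
    """Return the k corpus entries whose states are closest to `state`.

    Assumption: closeness is plain Euclidean distance in raw state space.
    """
    return sorted(corpus, key=lambda entry: math.dist(entry["state"], state))[:k]


def act_from_examples(corpus, state, k=3):
    """Pick an action and return it with the examples that support it.

    Assumption: among the k nearest examples, copy the action of the one
    with the highest recorded return.
    """
    subset = select_corpus_subset(corpus, state, k)
    best = max(subset, key=lambda entry: entry["ret"])
    return best["action"], subset


# Hypothetical offline dataset: each entry is one logged decision.
corpus = [
    {"state": (0.0, 0.0), "action": 0.1, "ret": 1.0},
    {"state": (0.1, 0.0), "action": 0.3, "ret": 2.0},
    {"state": (5.0, 5.0), "action": -1.0, "ret": 9.0},
    {"state": (0.0, 0.2), "action": 0.2, "ret": 0.5},
]

action, evidence = act_from_examples(corpus, (0.02, 0.0), k=2)
print(action, [entry["state"] for entry in evidence])
```

In this toy setup the distant high-return example at `(5.0, 5.0)` never influences the decision, which loosely mirrors the conservative behavior the abstract mentions: the action stays supported by nearby data.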

Hao Sun, Alihan Hüyük, Daniel Jarrett, Mihaela van der Schaar • 2023

Related benchmarks

Task | Dataset | Result | Rank
Continuous Control | LunarLanderContinuous offline trajectories v2 | Episodic Cumulative Reward: 253.7 | 35
Continuous Control | BipedalWalker v3 | Episodic Cumulative Reward: 277 | 8
Offline Control | Heterogeneous Pendulum Low-Data 100,000 transition steps | Cumulative Reward: -1.39 | 7
Offline Control | Heterogeneous Pendulum 300,000 transition steps (Mid-Data) | Cumulative Reward: -1.25 | 7
Offline Control | Heterogeneous Pendulum Rich-Data 600,000 transition steps | Cumulative Reward: -0.6 | 7
