Scalable Kernel Inverse Optimization
About
Inverse Optimization (IO) is a framework for learning the unknown objective function of an expert decision-maker from a past dataset. In this paper, we extend the hypothesis class of IO objective functions to a reproducing kernel Hilbert space (RKHS), thereby enhancing feature representation to an infinite-dimensional space. We demonstrate that a variant of the representer theorem holds for a specific training loss, allowing the reformulation of the problem as a finite-dimensional convex optimization program. To address scalability issues commonly associated with kernel methods, we propose the Sequential Selection Optimization (SSO) algorithm to efficiently train the proposed Kernel Inverse Optimization (KIO) model. Finally, we validate the generalization capabilities of the proposed KIO model and the effectiveness of the SSO algorithm through learning-from-demonstration tasks on the MuJoCo benchmark.
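The representer-theorem reduction described above can be illustrated with a standard kernel method. The sketch below is a generic analogue only: it uses kernel ridge regression on toy data (not the paper's KIO training loss), and a greedy coordinate-update loop that is merely in the spirit of sequential selection, not the SSO algorithm itself. All names and data here are hypothetical.

```python
import numpy as np

def rbf_kernel(X, Y, gamma=1.0):
    """Gram matrix K[i, j] = exp(-gamma * ||x_i - y_j||^2)."""
    sq = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq)

# Toy 1-D regression data standing in for expert demonstrations (hypothetical).
rng = np.random.default_rng(0)
X = rng.uniform(-3.0, 3.0, size=(15, 1))
y = np.sin(X[:, 0])

n, lam = len(X), 0.1
K = rbf_kernel(X, X)

# Representer theorem: the RKHS minimizer has the form
# f(.) = sum_i alpha_i * k(x_i, .), so the infinite-dimensional problem
# collapses to a finite-dimensional one in alpha; for ridge regression
# that is the linear system (K + lam*I) alpha = y.
alpha_closed = np.linalg.solve(K + lam * np.eye(n), y)

# A greedy coordinate-descent analogue of "sequential selection": repeatedly
# pick the coordinate with the largest gradient magnitude of the convex
# quadratic objective 0.5*||K a - y||^2 + 0.5*lam*a'Ka and minimize along it.
A = K @ K + lam * K          # Hessian of the objective above (positive diagonal)
b = K @ y
alpha = np.zeros(n)
for _ in range(5000):
    grad = A @ alpha - b
    i = int(np.argmax(np.abs(grad)))
    alpha[i] -= grad[i] / A[i, i]   # exact line search along coordinate i

# Both routes yield (nearly) the same fitted function on the training inputs.
pred_closed, pred_cd = K @ alpha_closed, K @ alpha
```

The closed-form solve and the coordinate-wise loop recover essentially the same predictor; working one coordinate (or small block) at a time is what lets decomposition schemes of this kind avoid factorizing the full Gram matrix, which is the scalability concern kernel methods face.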
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Offline Reinforcement Learning | D4RL halfcheetah-medium-expert | Normalized Score | 46.4 | 117 |
| Offline Reinforcement Learning | D4RL hopper-medium-expert | Normalized Score | 79.6 | 115 |
| Locomotion Control | D4RL walker2d-medium-expert | Normalized Return | 100.1 | 23 |
| Continuous Control | D4RL hopper-medium | Normalized Return | 50.2 | 19 |
| Continuous Control | D4RL walker2d-medium | Normalized Return | 74.6 | 14 |
| Continuous Control | D4RL hopper-expert | Normalized Return | 109.9 | 5 |
| Continuous Control | D4RL walker2d-expert | Normalized Return | 108.5 | 5 |
| Continuous Control | D4RL halfcheetah-medium | Normalized Return | 39.0 | 5 |
| Continuous Control | D4RL halfcheetah-expert | Normalized Return | 84.4 | 5 |