Local Preferential Bayesian Optimization

About

Bayesian optimization (BO) is a popular and effective approach for tuning expensive, noisy experiments, but requires the formulation of an explicit objective function. Preferential BO (PBO) removes this requirement by learning from pairwise human feedback, yet existing methods struggle to efficiently optimize beyond low- and medium-dimensional problems due to their global search approaches. We address this limitation by developing a family of local PBO methods that transfer key ideas from high-dimensional BO to the preferential setting. In particular, we introduce local PBO methods which adapt trust-region and derivative-informed local search to pairwise preference feedback, where the latter exploits first- and second-order derivatives of the Laplace-approximated GP posterior. Our benchmark on GP sample paths, standard optimization benchmark functions, and policy-search tasks shows that local PBO methods are especially effective in high-dimensional and complex landscapes with steep optima. Compared with global preference-based baselines, they can substantially reduce cumulative regret, making them particularly useful for real-world preference-based optimization tasks such as policy search.
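To make the trust-region idea concrete, here is a minimal sketch of local search driven only by pairwise comparisons. It is not the authors' method: the Laplace-approximated GP and the derivative-informed steps are replaced by uniform random proposals inside a box around the incumbent, and the human is simulated by a noiseless comparison of a hidden function. The names `duel` and `local_preferential_search`, and all parameter values, are hypothetical.

```python
import random


def duel(f, x_a, x_b):
    """Simulated preference oracle: True if x_a is preferred over x_b.

    Assumption: lower values of the hidden function f are preferred and the
    feedback is noiseless. A real user would only return the comparison.
    """
    return f(x_a) < f(x_b)


def local_preferential_search(f, x0, n_duels=300, length=0.5, seed=0,
                              grow=2.0, shrink=0.5, succ_tol=3, fail_tol=5,
                              length_min=1e-3, length_max=1.0):
    """Trust-region style local search that only ever sees duel outcomes."""
    rng = random.Random(seed)
    x_best = list(x0)
    succ = fail = 0
    for _ in range(n_duels):
        # Propose a candidate inside the box trust region around the incumbent.
        cand = [xi + length * rng.uniform(-1.0, 1.0) for xi in x_best]
        if duel(f, cand, x_best):
            # Candidate won the duel: it becomes the new incumbent.
            x_best, succ, fail = cand, succ + 1, 0
        else:
            succ, fail = 0, fail + 1
        # Expand the region after repeated wins, shrink it after repeated losses.
        if succ >= succ_tol:
            length, succ = min(length * grow, length_max), 0
        elif fail >= fail_tol:
            length, fail = max(length * shrink, length_min), 0
    return x_best


def sphere(x):
    """Hidden test function, used only inside the simulated oracle."""
    return sum(xi * xi for xi in x)


x_start = [1.0] * 20
x_final = local_preferential_search(sphere, x_start)
```

Because a candidate replaces the incumbent only when it wins its duel, the incumbent never gets worse under noiseless feedback. A model-based variant would choose candidates from a preference GP fitted to the duel history instead of sampling them uniformly.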

Johanna Menn, Miriam Kober, Paul Brunzema, David Stenger, Sebastian Trimpe • 2026

Related benchmarks

Task	Dataset	Metric	Result	Rank
Policy Search	HopperLinearPolicyTask 33D	f_best	768	8
Policy Search	Walker2dLinearPolicyTask 102D	Best Objective Value (f_best)	275	6

