Nearly-Optimal Algorithm for Adversarial Kernelized Bandits

About

This paper studies kernelized bandits (also known as Gaussian process bandits) in an adversarial environment, where the reward functions in a known reproducing kernel Hilbert space (RKHS) may be adversarially chosen at each round. We show that the exponential-weight algorithm achieves $\tilde{O}(\sqrt{T \gamma_T})$ adversarial regret, where $T$ and $\gamma_T$ denote the number of total rounds and the maximum information gain, respectively. For squared exponential (SE) and $\nu$-Mat\'ern kernels, we also show algorithm-independent lower bounds that guarantee the optimality of our algorithm up to polylogarithmic factors. Furthermore, we present a computationally efficient variant of our algorithm using Nystr\"om approximation while maintaining nearly optimal regret guarantees.

Shogo Iwazaki• 2026

Related benchmarks

Task	Dataset	Result
Bandit Optimization	Setting 1	Regret14.7	36
Bandit Optimization	Setting 2	Regret11.1	36
Bandit Optimization	Setting 3	Per-round Time Cost (s)0.2	28
Bandit Optimization	Setting 4	Per-round Time Cost (s)0.3	28
Kernel Bandit	Adversarial Kernel Bandit Compact input domain X subset of R^d Theoretical bounds	Regret (nu-Matern Kernel)2	4

Showing 5 of 5 rows

Other info

Follow for update

@wizwand_team Discord