Share your thoughts, 1 month free Claude Pro on usSee more
WorkDL logo mark

Nearly-Optimal Algorithm for Adversarial Kernelized Bandits

About

This paper studies kernelized bandits (also known as Gaussian process bandits) in an adversarial environment, where the reward functions in a known reproducing kernel Hilbert space (RKHS) may be adversarially chosen at each round. We show that the exponential-weight algorithm achieves $\tilde{O}(\sqrt{T \gamma_T})$ adversarial regret, where $T$ and $\gamma_T$ denote the number of total rounds and the maximum information gain, respectively. For squared exponential (SE) and $\nu$-Mat\'ern kernels, we also show algorithm-independent lower bounds that guarantee the optimality of our algorithm up to polylogarithmic factors. Furthermore, we present a computationally efficient variant of our algorithm using Nystr\"om approximation while maintaining nearly optimal regret guarantees.

Shogo Iwazaki• 2026

Related benchmarks

TaskDatasetResultRank
Bandit OptimizationSetting 1
Regret14.7
36
Bandit OptimizationSetting 2
Regret11.1
36
Bandit OptimizationSetting 3
Per-round Time Cost (s)0.2
28
Bandit OptimizationSetting 4
Per-round Time Cost (s)0.3
28
Kernel BanditAdversarial Kernel Bandit Compact input domain X subset of R^d Theoretical bounds
Regret (nu-Matern Kernel)2
4
Showing 5 of 5 rows

Other info

Follow for update