A Non-asymptotic Approach to Best-Arm Identification for Gaussian Bandits

About

We propose a new strategy for best-arm identification with fixed confidence of Gaussian variables with bounded means and unit variance. This strategy, called Exploration-Biased Sampling, is not only asymptotically optimal: it is, to the best of our knowledge, the first strategy with non-asymptotic bounds that asymptotically matches the sample complexity. But the main advantage over other algorithms like Track-and-Stop is an improved behavior regarding exploration: Exploration-Biased Sampling is biased towards exploration in a subtle but natural way that makes it more stable and interpretable. These improvements are made possible by a new analysis of the sample complexity optimization problem, which yields a faster numerical resolution scheme and several quantitative regularity results that we believe to be of high independent interest.
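To make the "sample complexity optimization problem" concrete, here is a minimal sketch of the inner objective as it is usually written in the fixed-confidence literature for unit-variance Gaussian arms: for sampling proportions w, the value is the minimum over suboptimal arms a of w_best * w_a / (w_best + w_a) * (mu_best - mu_a)^2 / 2, and the inverse of its maximum over w is the characteristic time. This formula is the standard one and is assumed here, not quoted from this page; the function name `gaussian_objective` is hypothetical, and the sketch only evaluates the objective for given proportions. It does not implement Exploration-Biased Sampling or the paper's numerical resolution scheme.

```python
from typing import Sequence


def gaussian_objective(mu: Sequence[float], w: Sequence[float]) -> float:
    """Evaluate min over a != best of
    w_best * w_a / (w_best + w_a) * (mu_best - mu_a)^2 / 2
    for unit-variance Gaussian arms (assumed standard formulation)."""
    if len(mu) != len(w) or len(mu) < 2:
        raise ValueError("mu and w must have the same length, at least 2")

    # Index of the arm with the largest mean (assumed unique).
    best = max(range(len(mu)), key=lambda a: mu[a])

    values = []
    for a in range(len(mu)):
        if a == best:
            continue
        total = w[best] + w[a]
        if total == 0.0:
            # Neither arm is sampled: no information separates them.
            values.append(0.0)
            continue
        weight = w[best] * w[a] / total
        values.append(weight * (mu[best] - mu[a]) ** 2 / 2.0)
    return min(values)


if __name__ == "__main__":
    # Two arms with gap 1 and equal proportions.
    print(gaussian_objective([1.0, 0.0], [0.5, 0.5]))
```

For two arms with equal proportions and gap Δ, the value is Δ²/8, so the characteristic time in that case is 8/Δ².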

Antoine Barrier, Aurélien Garivier, Tomáš Kocák • 2021

Related benchmarks

Task | Dataset | Result | Rank
Best Arm Identification | Random Gaussian instances K=10 | Avg CPU Time (s): 78.38 | 9