
Automatic Termination for Hyperparameter Optimization

About

Bayesian optimization (BO) is a widely popular approach for hyperparameter optimization (HPO) in machine learning. At its core, BO iteratively evaluates promising configurations until a user-defined budget, such as wall-clock time or number of iterations, is exhausted. While the final performance after tuning depends heavily on the provided budget, it is hard to specify an optimal value in advance. In this work, we propose an effective and intuitive termination criterion for BO that automatically stops the procedure once it is sufficiently close to the global optimum. Our key insight is that the discrepancy between the true objective (predictive performance on test data) and the computable target (validation performance) suggests stopping once the suboptimality in optimizing the target is dominated by the statistical estimation error. Across an extensive range of real-world HPO problems and baselines, we show that our termination criterion achieves a better trade-off between test performance and optimization time. Additionally, we find that overfitting may occur in the context of HPO, which is arguably an overlooked problem in the literature, and show how our termination criterion helps to mitigate this phenomenon on both small and large datasets.

Anastasia Makarova, Huibin Shen, Valerio Perrone, Aaron Klein, Jean Baptiste Faddoul, Andreas Krause, Matthias Seeger, Cedric Archambeau • 2021
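
The stopping rule described in the abstract can be illustrated with a minimal sketch. It assumes a minimization objective (for example, validation loss), a surrogate model that supplies a posterior mean and standard deviation for each configuration, and a regret upper bound formed as the smallest upper confidence bound among evaluated configurations minus the smallest lower confidence bound among candidate configurations. The function names, the `beta` confidence multiplier, and the `estimation_error` argument (for example, a standard-error estimate of the validation score) are hypothetical choices for illustration and are not taken from the abstract.

```python
from typing import Sequence


def regret_upper_bound(
    mean_evaluated: Sequence[float],
    std_evaluated: Sequence[float],
    mean_candidates: Sequence[float],
    std_candidates: Sequence[float],
    beta: float = 2.0,
) -> float:
    """Upper-bound the suboptimality of the best evaluated configuration.

    Assumption: the objective is minimized and the surrogate's confidence
    interval [mean - beta * std, mean + beta * std] contains the true value.
    """
    # Smallest upper confidence bound among configurations already evaluated.
    ucb_min = min(m + beta * s for m, s in zip(mean_evaluated, std_evaluated))
    # Smallest lower confidence bound among candidate configurations.
    lcb_min = min(m - beta * s for m, s in zip(mean_candidates, std_candidates))
    return ucb_min - lcb_min


def should_stop(
    mean_evaluated: Sequence[float],
    std_evaluated: Sequence[float],
    mean_candidates: Sequence[float],
    std_candidates: Sequence[float],
    estimation_error: float,
    beta: float = 2.0,
) -> bool:
    """Return True once the regret bound is dominated by the estimation error."""
    bound = regret_upper_bound(
        mean_evaluated, std_evaluated, mean_candidates, std_candidates, beta
    )
    return bound <= estimation_error
```

In a BO loop, such a check would run after each surrogate update. Including the evaluated configurations in the candidate set keeps the bound non-negative. How the estimation error is obtained (for instance, from the spread of cross-validation folds) is left open here.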

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Global Optimization | GP 10-6 D=4, T=128 | Stopping Time: 18 | 12 |
| Global Optimization | GP 10-2 (D=4, T=256) | Stopping Time: 224 | 12 |
| Global Optimization | GP 10-6 (D=6, T=256) | Stopping Time: 31 | 6 |
| Global Optimization | Hartmann 6 (D=6, T=64) | Stopping Time: 26 | 6 |
| Global Optimization | GP† 10-6 (D=2, T=64) | Stopping Time: 12 | 6 |
| Global Optimization | Hartmann 3 D=3, T=64 | Stopping Time: 15 | 6 |
| Global Optimization | CNN D=4, T=256 | Stopping Time: 8 | 6 |
| Global Optimization | Branin D=2, T=128 | Stopping Time: 31 | 6 |
| Global Optimization | Rosenbrock D=4, T=96 | Stopping Time: 68 | 6 |
| Global Optimization | XGBoost D=3, T=128 | Stopping Time: 16 | 6 |
Showing 10 of 12 rows
