c-TPE: Tree-structured Parzen Estimator with Inequality Constraints for Expensive Hyperparameter Optimization

About

Hyperparameter optimization (HPO) is crucial for strong performance of deep learning algorithms, and real-world applications often impose constraints, such as on memory usage or latency, on top of the performance requirement. In this work, we propose constrained TPE (c-TPE), an extension of the widely used, versatile Bayesian optimization method, tree-structured Parzen estimator (TPE), to handle these constraints. Our proposed extension goes beyond a simple combination of an existing acquisition function and the original TPE, and instead includes modifications that address the issues that cause such a combination to perform poorly. We thoroughly analyze these modifications both empirically and theoretically, providing insights into how they overcome these issues. In the experiments, we demonstrate that c-TPE achieves the best average rank among existing methods, with statistical significance, on 81 expensive HPO problems with inequality constraints. Due to the lack of baselines, we only discuss the applicability of our method to hard-constrained optimization in Appendix D. The implementation is now available via OptunaHub.
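To make the idea of combining an objective with an inequality constraint concrete, here is a minimal toy sketch in plain Python. It is not the c-TPE implementation and does not reproduce the modifications the abstract refers to. It only illustrates the general shape of a TPE-style constrained acquisition: split observations into "good" and "bad" groups for the objective and for the constraint, estimate each group's density, and multiply the resulting probabilities. The helper names (`kde`, `split_prob`, `constrained_acquisition`), the fixed bandwidth, the 25% quantile, and the 1-D toy problem are all assumptions made for illustration.

```python
import math


def kde(points, bandwidth=0.5):
    """1-D Gaussian kernel density estimate (hypothetical helper, fixed bandwidth)."""
    norm = bandwidth * math.sqrt(2.0 * math.pi)

    def density(x):
        if not points:
            return 1e-12  # avoid division by zero when a group is empty
        total = sum(math.exp(-0.5 * ((x - p) / bandwidth) ** 2) for p in points)
        return total / (len(points) * norm) + 1e-12

    return density


def split_prob(xs, is_good):
    """Estimate P(good | x) via Bayes' rule from the good/bad densities."""
    good = [x for x, ok in zip(xs, is_good) if ok]
    bad = [x for x, ok in zip(xs, is_good) if not ok]
    gamma = len(good) / len(xs)  # prior probability of the "good" group
    l, g = kde(good), kde(bad)
    return lambda x: gamma * l(x) / (gamma * l(x) + (1.0 - gamma) * g(x))


def constrained_acquisition(xs, ys, cs, c_max, top_fraction=0.25):
    """Product of P(objective is good | x) and P(constraint is satisfied | x)."""
    n_top = max(1, math.ceil(top_fraction * len(xs)))
    y_star = sorted(ys)[n_top - 1]  # objective threshold (smaller is better)
    p_obj = split_prob(xs, [y <= y_star for y in ys])
    p_con = split_prob(xs, [c <= c_max for c in cs])
    return lambda x: p_obj(x) * p_con(x)


# Toy problem (assumed for illustration): minimize (x - 2)^2 subject to x <= 1.
xs = [-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0, 4.0]
ys = [(x - 2.0) ** 2 for x in xs]
cs = list(xs)  # constraint value c(x) = x, feasible when c(x) <= 1

acq = constrained_acquisition(xs, ys, cs, c_max=1.0)
candidates = [-3.0 + 0.5 * i for i in range(15)]  # grid from -3 to 4
best = max(candidates, key=acq)
print(best)
```

In this toy setting the unconstrained minimizer x = 2 scores well on the objective term but poorly on the constraint term, so the product favors candidates near the feasibility boundary instead.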

Shuhei Watanabe, Frank Hutter • 2022

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Constrained Hyperparameter Optimization | 9 Tabular HPO Benchmarks HPOlib, NAS-Bench-101, NAS-Bench-201 | Wins: 27 | 72 |
| Black-box Optimization | Crashy Branin | Best Objective Value: 1.15 | 28 |
| Constrained Black-box Optimization | 9 Tabular Benchmarks Constraint: Network size | Wins: 80 | 24 |
| Constrained Black-box Optimization | 9 Tabular Benchmarks Constraint: Runtime & Network size | Wins: 79 | 24 |
| Constrained Black-box Optimization | 9 Tabular Benchmarks Constraint: Runtime | Wins: 71 | 24 |
| Model Discovery | A100 | vit_tiny Discovery Rate: 4 | 4 |
| Model Discovery | L4 | vit_tiny Discovery Rate: 3 | 4 |
| Model Discovery | T4 | Discovery Rate (vit_tiny): 3 | 4 |
| Neural Architecture Search | Search Space H100 constraints (test) | Mean Accuracy: 75.8 | 4 |
| Neural Architecture Search | Search Space A100 constraints (test) | Accuracy: 76.2 | 4 |
Showing 10 of 17 rows
