Auto-WEKA: Combined Selection and Hyperparameter Optimization of Classification Algorithms
About
Many different machine learning algorithms exist; taking into account each algorithm's hyperparameters, there is a staggeringly large number of possible alternatives overall. We consider the problem of simultaneously selecting a learning algorithm and setting its hyperparameters, going beyond previous work that addresses these issues in isolation. We show that this problem can be addressed by a fully automated approach, leveraging recent innovations in Bayesian optimization. Specifically, we consider a wide range of feature selection techniques (combining 3 search and 8 evaluator methods) and all classification approaches implemented in WEKA, spanning 2 ensemble methods, 10 meta-methods, 27 base classifiers, and hyperparameter settings for each classifier. On each of 21 popular datasets from the UCI repository, the KDD Cup 09, variants of the MNIST dataset and CIFAR-10, we show classification performance often much better than using standard selection/hyperparameter optimization methods. We hope that our approach will help non-expert users to more effectively identify machine learning algorithms and hyperparameter settings appropriate to their applications, and hence to achieve improved performance.
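The core idea, treating algorithm choice as just another (top-level) hyperparameter in one joint search space, can be sketched in a few lines. Auto-WEKA itself searches this space with SMAC, a Bayesian optimization method, over WEKA's classifiers; the toy sketch below uses plain random search over an invented three-algorithm space and a placeholder objective, purely to illustrate the combined selection-and-tuning (CASH) formulation.

```python
import random

# Hypothetical illustration of the combined algorithm selection and
# hyperparameter optimization (CASH) search space. Auto-WEKA uses
# Bayesian optimization (SMAC) over WEKA's classifiers; the algorithm
# names, grids, and random search here are stand-ins for exposition.
SEARCH_SPACE = {
    "random_forest": {"n_trees": [10, 50, 100], "max_depth": [2, 5, 10]},
    "svm": {"C": [0.1, 1.0, 10.0], "kernel": ["linear", "rbf"]},
    "knn": {"k": [1, 3, 5, 7]},
}

def sample_configuration(rng):
    """Draw one (algorithm, hyperparameters) point from the joint space."""
    algo = rng.choice(sorted(SEARCH_SPACE))
    params = {name: rng.choice(values)
              for name, values in SEARCH_SPACE[algo].items()}
    return algo, params

def toy_cv_error(algo, params, rng):
    """Placeholder for cross-validation error; a real system would
    train the chosen classifier and evaluate it on held-out folds."""
    return rng.random()  # fake objective in [0, 1)

def cash_random_search(n_iters=20, seed=0):
    """Minimize the objective jointly over algorithms and settings."""
    rng = random.Random(seed)
    best = None
    for _ in range(n_iters):
        algo, params = sample_configuration(rng)
        err = toy_cv_error(algo, params, rng)
        if best is None or err < best[0]:
            best = (err, algo, params)
    return best

if __name__ == "__main__":
    err, algo, params = cash_random_search()
    print(f"best: {algo} {params} (error {err:.3f})")
```

The point of the formulation is that the optimizer never has to commit to an algorithm up front: every evaluation is a full (algorithm, settings) pair, so budget flows naturally toward whichever classifier family is performing best.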
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| AutoML | AutoML Benchmark 10-fold (train test) | Wins | 1 | 14 |
| AutoML for Structured Data | Kaggle Benchmark (11 competitions) | Failures | 1 | 12 |
| Supervised Learning | AutoML Benchmark 2019 (test) | Count Better Than Original | 18 | 8 |
| Tabular Classification | AutoML Benchmark | Wins | 6 | 7 |
| Supervised Learning | 39 AutoML Benchmark (test) | Failures | 6 | 6 |
| Tabular Classification | AutoML Benchmark (39 datasets) (test) | Wins | 4 | 6 |