AMLB: an AutoML Benchmark

About

Comparing different AutoML frameworks is notoriously challenging and often done incorrectly. We introduce an open and extensible benchmark that follows best practices and avoids common mistakes when comparing AutoML frameworks. We conduct a thorough comparison of 9 well-known AutoML frameworks across 71 classification and 33 regression tasks. The differences between the AutoML frameworks are explored with a multi-faceted analysis, evaluating model accuracy, its trade-offs with inference time, and framework failures. We also use Bradley-Terry trees to discover subsets of tasks where the relative AutoML framework rankings differ. The benchmark comes with an open-source tool that integrates with many AutoML frameworks and automates the empirical evaluation process end-to-end: from framework installation and resource allocation to in-depth evaluation. The benchmark uses public data sets, can be easily extended with other AutoML frameworks and tasks, and has a website with up-to-date results.

Pieter Gijsbers, Marcos L. P. Bueno, Stefan Coors, Erin LeDell, Sébastien Poirier, Janek Thomas, Bernd Bischl, Joaquin Vanschoren • 2022

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Automated Machine Learning | 80 tasks all (test) | Average Test Rank: 6.19 | 19 |
| Regression | 30 regression tasks (test) | Average Test Rank: 6.2 | 19 |
| Classification | 50 classification tasks (test) | Average Test Rank: 6.18 | 19 |
| Cell Painting morphology prediction | BBBC036 SMILES-based | MSE: 3.4763 | 7 |
| Cell Painting morphology prediction | BBBC036 Plate-based | MSE: 2.5335 | 7 |
| Cell Painting morphology prediction | BBBC047 SMILES-based split | MSE: 2.8314 | 7 |
| Cell Painting morphology prediction | BBBC047 Plate-based | MSE: 2.6342 | 7 |
| Cell Painting morphology prediction | CPG0016 (SMILES-based split) | MSE: 1.3428 | 7 |
| Cell Painting morphology prediction | CPG0016 Plate-based | MSE: 1.1674 | 7 |
