
Well-tuned Simple Nets Excel on Tabular Datasets

About

Tabular datasets are the last "unconquered castle" for deep learning, with traditional ML methods like Gradient-Boosted Decision Trees still performing strongly even against recent specialized neural architectures. In this paper, we hypothesize that the key to boosting the performance of neural networks lies in rethinking the joint and simultaneous application of a large set of modern regularization techniques. As a result, we propose regularizing plain Multilayer Perceptron (MLP) networks by searching for the optimal combination/cocktail of 13 regularization techniques for each dataset using a joint optimization over the decision on which regularizers to apply and their subsidiary hyperparameters. We empirically assess the impact of these regularization cocktails for MLPs in a large-scale empirical study comprising 40 tabular datasets and demonstrate that (i) well-regularized plain MLPs significantly outperform recent state-of-the-art specialized neural network architectures, and (ii) they even outperform strong traditional ML methods, such as XGBoost.
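The joint optimization described above can be pictured as a conditional search space: each regularizer has an on/off decision, and its hyperparameters are only sampled when it is switched on. The sketch below is a minimal illustration under stated assumptions, not the authors' implementation. The three regularizers, their ranges, and the names SEARCH_SPACE, sample_cocktail, search, and evaluate are all hypothetical, and plain random search stands in for whatever optimizer the paper actually uses.

```python
import random

# Hypothetical search space (illustrative only): each regularizer maps to
# the hyperparameters that matter only when that regularizer is active.
SEARCH_SPACE = {
    "weight_decay": {"strength": (1e-5, 1e-1)},
    "dropout": {"rate": (0.0, 0.8)},
    "mixup": {"alpha": (0.0, 1.0)},
}


def sample_cocktail(rng):
    """Jointly sample which regularizers are on and their hyperparameters."""
    cocktail = {}
    for name, hyperparams in SEARCH_SPACE.items():
        # Decide whether to include this regularizer at all.
        if rng.random() < 0.5:
            # Conditional hyperparameters are sampled only for active ones.
            cocktail[name] = {
                hp: rng.uniform(low, high)
                for hp, (low, high) in hyperparams.items()
            }
    return cocktail


def search(evaluate, n_trials, seed=0):
    """Random search over cocktails (a stand-in for a stronger optimizer).

    `evaluate` is a caller-supplied function that would train an MLP with
    the given cocktail and return a validation score (higher is better).
    """
    rng = random.Random(seed)
    best_cfg, best_score = None, float("-inf")
    for _ in range(n_trials):
        cfg = sample_cocktail(rng)
        score = evaluate(cfg)
        if score > best_score:
            best_cfg, best_score = cfg, score
    return best_cfg, best_score
```

In practice, evaluate would train and validate a network on one dataset, so the search is repeated per dataset and can return a different cocktail each time.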

Arlind Kadra, Marius Lindauer, Frank Hutter, Josif Grabocka • 2021

Related benchmarks

Task            Dataset     Result                     Rank
Classification  nomao       --                         46
Classification  Credit-g    --                         22
Classification  kc1         Balanced Accuracy 74.381   18
Classification  australian  --                         18
Classification  Bank        Balanced Accuracy 85.993   14
Classification  Shuttle     Balanced Accuracy 0.9995   14
Classification  Arrhythmia  Balanced Accuracy 61.461   9
Classification  Adult       Balanced Accuracy 82.443   8
Classification  numerai     Balanced Accuracy 52.668   8
Classification  anneal      Balanced Accuracy 89.27    8
Showing 10 of 40 rows
