A Closer Look at Deep Learning Methods on Tabular Datasets

About

Tabular data is prevalent across diverse domains in machine learning. With the rapid progress of deep tabular prediction methods, especially pretrained (foundation) models, there is a growing need to evaluate these methods systematically and to understand their behavior. We present an extensive study on TALENT, a collection of 300+ datasets spanning broad ranges of size, feature composition (numerical/categorical mixes), domains, and output types (binary, multi-class, regression). Our evaluation shows that ensembling benefits both tree-based and neural approaches. Traditional gradient-boosted trees remain very strong baselines, yet recent pretrained tabular models now match or surpass them on many tasks, narrowing, but not eliminating, the historical advantage of tree ensembles. Despite architectural diversity, top performance concentrates within a small subset of models, providing practical guidance for method selection. To explain these outcomes, we quantify dataset heterogeneity by learning from meta-features and early training dynamics to predict later validation behavior. This dynamics-aware analysis indicates that heterogeneity, such as the interplay of categorical and numerical attributes, largely determines which family of methods is favored. Finally, we introduce a two-level design beyond the 300 common-size datasets: a compact TALENT-tiny core (45 datasets) for rapid, reproducible evaluation, and a TALENT-extension suite targeting high-dimensional, many-class, and very large-scale settings for stress testing. In summary, these results offer actionable insights into the strengths, limitations, and future directions for improving deep tabular learning.

Han-Jia Ye, Si-Yang Liu, Hao-Run Cai, Qi-Le Zhou, De-Chuan Zhan • 2024

Related benchmarks

Task | Dataset | Metric | Result | Rank
Binary Classification | TALENT (test) | Top-1 Accuracy | 18.5 | 113
Multiclass Classification | TabArena Lite | Elo Rating | 1.44e+3 | 48
Binary Classification | TabArena | Elo Rating | 1.40e+3 | 48
Regression | TabArena Lite | Elo | 1.60e+3 | 48
Multiclass Classification | TALENT | SGMε | 32.6 | 42
Multiclass Classification | TALENT Multiclass (> 10 classes) Full (avg across datasets) | Rank | 6 | 31
Regression | TALENT 100 datasets | Rank | 7.79 | 28
Classification | Covertype | Error Rate | 3.33 | 24
Classification | higgs | ERR | 24.35 | 19
Binary Classification | TALENT TabArena | Shifted Geometric Mean Error (SGME) | 0.1502 | 10
Showing 10 of 27 rows
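
The table reports a Shifted Geometric Mean Error (SGME) without defining it. As a rough illustration only, the sketch below uses the common definition of a shifted geometric mean, exp(mean(log(e + shift))) - shift, applied to per-dataset errors. The function name, the shift value of 0.01, and the example error rates are assumptions for illustration; none of them come from this page, and the benchmark's own computation may differ.

```python
import math


def shifted_geometric_mean(errors, shift=0.01):
    """Aggregate per-dataset errors with a shifted geometric mean.

    Computes exp(mean(log(e + shift))) - shift. The default shift of 0.01
    is an assumed value, not one stated by the benchmark.
    """
    if not errors:
        raise ValueError("errors must be non-empty")
    if any(e + shift <= 0 for e in errors):
        raise ValueError("every error plus the shift must be positive")
    log_mean = sum(math.log(e + shift) for e in errors) / len(errors)
    return math.exp(log_mean) - shift


# Hypothetical per-dataset error rates, for illustration only.
example_errors = [0.05, 0.20, 0.10]
sgm_error = shifted_geometric_mean(example_errors)
arithmetic_mean = sum(example_errors) / len(example_errors)
print(f"shifted geometric mean: {sgm_error:.4f}")
print(f"arithmetic mean:        {arithmetic_mean:.4f}")
```

Compared with an arithmetic mean, this aggregate is less dominated by a few datasets with very large errors, while the shift keeps near-zero errors from pulling the result toward zero.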
