
Tabular Transformers for Modeling Multivariate Time Series

About

Tabular datasets are ubiquitous in data science applications. Given their importance, it seems natural to apply state-of-the-art deep learning algorithms in order to fully unlock their potential. Here we propose neural network models for tabular time series that can optionally leverage their hierarchical structure. This results in two architectures for tabular time series: one for learning representations that is analogous to BERT, can be pre-trained end-to-end, and used in downstream tasks; and one that is akin to GPT and can be used to generate realistic synthetic tabular sequences. We demonstrate our models on two datasets: a synthetic credit card transaction dataset, where the learned representations are used for fraud detection and synthetic data generation, and a real pollution dataset, where the learned encodings are used to predict atmospheric pollutant concentrations. Code and data are available at https://github.com/IBM/TabFormer.
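The abstract's core idea is that rows of a tabular time series can be turned into token sequences, after which standard BERT/GPT-style sequence models apply. A minimal sketch of that preprocessing step is below; the field names, binning scheme, and function names are illustrative assumptions, not the actual API of the TabFormer repository.

```python
# Sketch: turning a tabular transaction row into a token sequence
# for a BERT/GPT-style sequence model. Continuous fields are
# quantized into bins; categorical fields get per-field vocabularies.

def make_quantizer(values, n_bins=4):
    """Build a binning function for one continuous field."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0
    def quantize(v):
        # Clamp into [0, n_bins - 1] so out-of-range values still map.
        return max(0, min(int((v - lo) / width), n_bins - 1))
    return quantize

def build_vocab(tokens):
    """Assign an integer id to each distinct categorical value."""
    return {tok: i for i, tok in enumerate(sorted(set(tokens)))}

# Hypothetical credit-card-like fields: one continuous, one categorical.
amount_bins = make_quantizer([5.0, 20.0, 75.0, 300.0], n_bins=4)
merchant_vocab = build_vocab(["grocery", "online", "travel"])

def encode_row(amount, merchant):
    # Each field contributes one token; a row becomes a short token
    # sequence, and a time series of rows a longer concatenated one.
    return [amount_bins(amount), merchant_vocab[merchant]]

print(encode_row(80.0, "travel"))  # -> [1, 2]
```

The hierarchical variant mentioned in the abstract would first embed the per-row tokens with a field-level encoder before a second, row-level transformer models the sequence of rows.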

Inkit Padhi, Yair Schiff, Igor Melnyk, Mattia Rigotti, Youssef Mroueh, Pierre Dognin, Jerret Ross, Ravi Nair, Erik Altman • 2020

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Classification | Internal Bank Data (test) | AUC | 0.894 | 12 |
| Multi-class classification | Age sberbank-sirius-lesson (test) | Accuracy | 60.1 | 12 |
| Binary classification | Gender 2019 (test) | AUC | 0.847 | 12 |
| Age prediction | Age | Accuracy | 54 | 12 |
| Classification | Rosbank | AUC | 0.798 | 12 |
| Classification | DataFusion | AUC | 0.648 | 10 |
| Gender prediction | Gender | AUC | 84.9 | 10 |
| pMCI classification | ADNI pMCI | Balanced Accuracy | 78.05 | 8 |
| Regression | Private Dataset | MAE | 1.08e+4 | 6 |
| Age classification | Private Dataset | Accuracy | 73.6 | 6 |

Showing 10 of 12 rows.
