Transformed Distribution Matching for Missing Value Imputation

About

We study the problem of imputing missing values in a dataset, which has important applications in many domains. The key to missing value imputation is to capture the data distribution with incomplete samples and impute the missing values accordingly. In this paper, by leveraging the fact that any two batches of data with missing values come from the same data distribution, we propose to impute the missing values of two batches of samples by transforming them into a latent space through deep invertible functions and matching them distributionally. To learn the transformations and impute the missing values simultaneously, a simple and well-motivated algorithm is proposed. Our algorithm has fewer hyperparameters to fine-tune and generates high-quality imputations regardless of how missing values are generated. Extensive experiments over a large number of datasets and competing benchmark algorithms show that our method achieves state-of-the-art performance.
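The idea in the abstract (transform two batches into a latent space with an invertible map, then adjust the missing entries until the two batches match in distribution) can be illustrated with a small, standard-library-only sketch. This is not the authors' implementation: the invertible map here is a fixed elementwise affine function rather than a learned deep invertible network, the discrepancy is a sliced 2-Wasserstein estimate chosen for simplicity (the abstract does not name the distance used), and every name (`transform`, `sliced_w2`, `impute`, `scale`, `shift`) is hypothetical.

```python
import math
import random


def transform(row, scale, shift):
    # Elementwise invertible affine map z = scale * x + shift.
    # Assumption: a fixed stand-in for learned deep invertible functions.
    return [s * x + b for x, s, b in zip(row, scale, shift)]


def sliced_w2(za, zb, rng, n_proj=16):
    """Sliced squared 2-Wasserstein estimate between two equal-size batches.

    Returns the loss plus per-sample gradients for za and zb
    (gradients of the loss multiplied by the batch size).
    """
    n, d = len(za), len(za[0])
    loss = 0.0
    ga = [[0.0] * d for _ in range(n)]
    gb = [[0.0] * d for _ in range(n)]
    for _ in range(n_proj):
        theta = [rng.gauss(0.0, 1.0) for _ in range(d)]
        norm = math.sqrt(sum(t * t for t in theta)) or 1.0
        theta = [t / norm for t in theta]
        pa = [sum(t * z for t, z in zip(theta, row)) for row in za]
        pb = [sum(t * z for t, z in zip(theta, row)) for row in zb]
        # In one dimension, optimal transport pairs the sorted projections.
        ia = sorted(range(n), key=pa.__getitem__)
        ib = sorted(range(n), key=pb.__getitem__)
        for i, j in zip(ia, ib):
            diff = pa[i] - pb[j]
            loss += diff * diff / (n * n_proj)
            for k in range(d):
                ga[i][k] += 2.0 * diff * theta[k] / n_proj
                gb[j][k] -= 2.0 * diff * theta[k] / n_proj
    return loss, ga, gb


def impute(data, observed, scale, shift, steps=200, batch=16, lr=0.1, seed=0):
    rng = random.Random(seed)
    d = len(data[0])
    x = [row[:] for row in data]
    # Initialise each missing entry with the mean of its column's observed values.
    for k in range(d):
        seen = [data[i][k] for i in range(len(data)) if observed[i][k]]
        mean = sum(seen) / len(seen)
        for i in range(len(data)):
            if not observed[i][k]:
                x[i][k] = mean
    for _ in range(steps):
        # Two disjoint batches drawn from the same (partially imputed) dataset.
        idx = rng.sample(range(len(x)), 2 * batch)
        a, b = idx[:batch], idx[batch:]
        za = [transform(x[i], scale, shift) for i in a]
        zb = [transform(x[i], scale, shift) for i in b]
        _, ga, gb = sliced_w2(za, zb, rng)
        for rows, grads in ((a, ga), (b, gb)):
            for i, g in zip(rows, grads):
                for k in range(d):
                    if not observed[i][k]:
                        # Chain rule through z = scale * x + shift;
                        # observed entries are never modified.
                        x[i][k] -= lr * g[k] * scale[k]
    return x


# Toy data: the second feature depends on the first; about 30% of it is hidden.
rng = random.Random(1)
full = []
for _ in range(64):
    u = rng.gauss(0.0, 1.0)
    full.append([u, 2.0 * u + 0.1 * rng.gauss(0.0, 1.0)])
observed = [[True, rng.random() > 0.3] for _ in full]
data = [[v if m else float("nan") for v, m in zip(row, mask)]
        for row, mask in zip(full, observed)]
imputed = impute(data, observed, scale=[1.0, 0.5], shift=[0.0, 0.0])
```

Only the missing entries are treated as free variables here; the transform parameters stay fixed. Learning the transformation jointly with the imputations, as the abstract describes, would require an automatic-differentiation framework and is outside the scope of this sketch.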

He Zhao, Ke Sun, Amir Dezfouli, Edwin Bonilla • 2023

Related benchmarks

Task                    Dataset                                   Result            Rank
Time Series Imputation  ETTm1                                     MSE 1.003         110
Time Series Imputation  ETTh1                                     MSE 0.991         86
Time Series Imputation  ETTm2                                     MSE 0.998         83
Classification          33 datasets missing rate <= 10% (test)    AUC 86.64         65
Time Series Imputation  Exchange                                  MSE 0.969         54
Classification          10 Datasets Missing rate > 10% (test)     AUC 80.06         50
Data Imputation         NPHA                                      Accuracy 66.29    30
Data Imputation         Gliomas                                   Accuracy 79.8     30
Data Imputation         Cancer                                    Accuracy 32.67    28
Data Imputation         Diabetes (1/3 omitted)                    Accuracy 52.1     16

Showing 10 of 17 rows
