
MIRACLE: Causally-Aware Imputation via Learning Missing Data Mechanisms

About

Missing data is an important problem in machine learning practice. Starting from the premise that imputation methods should preserve the causal structure of the data, we develop a regularization scheme that encourages any baseline imputation method to be causally consistent with the underlying data generating mechanism. Our proposal is a causally-aware imputation algorithm (MIRACLE). MIRACLE iteratively refines the imputation of a baseline by simultaneously modeling the missingness generating mechanism, encouraging imputation to be consistent with the causal structure of the data. We conduct extensive experiments on synthetic data and a variety of publicly available datasets to show that MIRACLE consistently improves imputation over a variety of benchmark methods across all three missingness scenarios: missing at random (MAR), missing completely at random (MCAR), and missing not at random (MNAR).
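
The three missingness scenarios named in the abstract (missing completely at random, missing at random, and missing not at random) can be illustrated with a small simulation. This is only a sketch of the standard definitions, not the MIRACLE algorithm or the paper's experimental setup: the function names (`missing_mask`, `observed_mean`), the two-column toy data, and the logistic form of the missingness probability are all assumptions made for illustration.

```python
import math
import random


def sigmoid(z):
    """Logistic function, used here as a hypothetical missingness probability."""
    return 1.0 / (1.0 + math.exp(-z))


def missing_mask(rows, mechanism, rate=0.3, seed=0):
    """Return a list of booleans, True where x1 is treated as missing.

    rows: list of (x0, x1) pairs; x0 is always observed, x1 may go missing.
    mechanism: "MCAR", "MAR", or "MNAR".
    """
    rng = random.Random(seed)
    mask = []
    for x0, x1 in rows:
        if mechanism == "MCAR":
            p = rate                # independent of all data values
        elif mechanism == "MAR":
            p = sigmoid(2.0 * x0)   # depends only on the observed column x0
        elif mechanism == "MNAR":
            p = sigmoid(2.0 * x1)   # depends on the value that goes missing
        else:
            raise ValueError(f"unknown mechanism: {mechanism}")
        mask.append(rng.random() < p)
    return mask


def observed_mean(rows, mask):
    """Mean of x1 over the rows where x1 was not masked out."""
    kept = [x1 for (_, x1), m in zip(rows, mask) if not m]
    return sum(kept) / len(kept)


# Toy data: two independent standard-normal columns (an illustrative assumption).
data_rng = random.Random(42)
rows = [(data_rng.gauss(0, 1), data_rng.gauss(0, 1)) for _ in range(5000)]

for mechanism in ("MCAR", "MAR", "MNAR"):
    mask = missing_mask(rows, mechanism)
    frac = sum(mask) / len(mask)
    print(mechanism, "missing fraction:", round(frac, 3),
          "observed mean of x1:", round(observed_mean(rows, mask), 3))
```

In this toy setup the observed mean of `x1` stays near its true value of zero under MCAR, and also under MAR because the two columns are independent here. Under MNAR it is pulled well below zero, since large values of `x1` are more likely to be missing. Bias of this kind is why the missingness mechanism matters for imputation.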

Trent Kyono, Yao Zhang, Alexis Bellot, Mihaela van der Schaar • 2021

Related benchmarks

Task            | Dataset                                | Result                 | Rank
Classification  | Musk2 downstream                       | Balanced Accuracy 93.7 | 45
Data Imputation | Gliomas                                | Accuracy 77.73         | 30
Data Imputation | NPHA                                   | Accuracy 56.05         | 30
Data Imputation | Cancer                                 | Accuracy 32.07         | 28
Data Imputation | Diabetes (1/3 omitted)                 | Accuracy 52.47         | 16
Regression      | Energy 0% non-corrupted features       | RMSE 0.311             | 15
Regression      | Energy 50% non-corrupted features      | RMSE 0.335             | 15
Regression      | Energy 100% non-corrupted features     | RMSE 0.343             | 15
Data Imputation | Concrete                               | MAE 0.132              | 14
Data Imputation | Wine                                   | MAE 0.082              | 14
Showing 10 of 14 rows
