MIDA: Multiple Imputation using Denoising Autoencoders
About
Missing data is a significant problem impacting all domains. The state-of-the-art framework for minimizing missing data bias is multiple imputation, for which the choice of an imputation model remains nontrivial. We propose a multiple imputation model based on overcomplete deep denoising autoencoders. Our proposed model is capable of handling different data types, missingness patterns, missingness proportions, and distributions. Evaluation on several real-life datasets shows that our proposed model significantly outperforms current state-of-the-art methods under varying conditions while simultaneously improving end-of-the-line analytics.
Lovedeep Gondara, Ke Wang • 2017
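The abstract names the main ingredients (an overcomplete denoising autoencoder, used to produce multiple imputations) without giving details. The sketch below is one minimal way to combine them in NumPy and is not the authors' implementation. The function names `dae_impute` and `multiple_impute`, the single hidden layer, the mean pre-fill, the observed-only loss, and all hyperparameter values (`extra_units`, `drop_p`, `epochs`, `lr`) are assumptions chosen for illustration.

```python
import numpy as np

def dae_impute(X, extra_units=7, drop_p=0.5, epochs=300, lr=0.05, seed=0):
    """Impute NaNs in X with a one-hidden-layer denoising autoencoder.

    Illustrative sketch only: architecture and hyperparameters are assumptions.
    """
    rng = np.random.default_rng(seed)
    obs = ~np.isnan(X)                        # True where a value is observed
    col_means = np.nanmean(X, axis=0)
    Xf = np.where(obs, X, col_means)          # pre-fill missing cells with column means
    n, d = X.shape
    h_dim = d + extra_units                   # overcomplete: hidden layer wider than input
    W1 = rng.normal(0.0, 0.1, (d, h_dim)); b1 = np.zeros(h_dim)
    W2 = rng.normal(0.0, 0.1, (h_dim, d)); b2 = np.zeros(d)
    n_obs = obs.sum()
    for _ in range(epochs):
        keep = rng.random(X.shape) >= drop_p  # denoising: randomly zero out inputs
        Xc = Xf * keep
        H = np.tanh(Xc @ W1 + b1)
        out = H @ W2 + b2
        # Gradient of the mean squared error over observed entries only (assumption).
        g_out = 2.0 * (out - Xf) * obs / n_obs
        g_H = g_out @ W2.T
        g_Z = g_H * (1.0 - H ** 2)            # derivative of tanh
        W2 -= lr * (H.T @ g_out); b2 -= lr * g_out.sum(axis=0)
        W1 -= lr * (Xc.T @ g_Z); b1 -= lr * g_Z.sum(axis=0)
    recon = np.tanh(Xf @ W1 + b1) @ W2 + b2   # reconstruct from the uncorrupted input
    return np.where(obs, X, recon)            # keep observed values, fill the rest

def multiple_impute(X, m=5):
    """Return m completed datasets, one per random seed."""
    return [dae_impute(X, seed=s) for s in range(m)]

# Toy data: 200 rows, 4 features in [0, 1], roughly 20% of cells set to missing.
rng = np.random.default_rng(42)
X_true = rng.random((200, 4))
X_miss = X_true.copy()
X_miss[rng.random(X_true.shape) < 0.2] = np.nan
imputations = multiple_impute(X_miss, m=5)
```

Each seed changes both the weight initialization and the corruption pattern, so the `m` completed datasets differ in their imputed cells. A downstream analysis would typically be run on each completed dataset and the results pooled.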
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Missing data estimation | Biobank | Mean RMSE 0.0805 | 13 |
| Patient state prediction | MIMIC-III (test) | AUROC 83.99 | 13 |
| Patient state prediction | Biobank (test) | AUROC 87.85 | 13 |
| Missing data estimation | MIMIC-III v1.4 (test) | Mean RMSE 0.0412 | 13 |
| Missing data estimation | Deterioration | RMSE 0.0309 | 13 |
| Patient state prediction | Deterioration (test) | AUROC 0.7488 | 13 |
| Missing data estimation | UNOS Heart | Mean RMSE 0.0589 | 12 |
| Missing data estimation | UNOS-Lung | Mean RMSE 0.0712 | 12 |
| Patient state prediction | UNOS-Heart (test) | AUROC 66.33 | 12 |
| Patient state prediction | UNOS-Lung (test) | AUROC 0.6574 | 12 |
Showing 10 of 19 rows
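The missing data estimation rows report RMSE, but this page does not say how it was computed. A common convention, assumed here, is to hide known values, impute them, and take the root mean squared error over the hidden cells only; "Mean RMSE" is then read as the average over several imputations or runs. The helper names `masked_rmse` and `mean_rmse` are hypothetical.

```python
import numpy as np

def masked_rmse(x_true, x_imputed, missing_mask):
    """RMSE restricted to cells that were hidden before imputation."""
    diff = (x_imputed - x_true)[missing_mask]
    return float(np.sqrt(np.mean(diff ** 2)))

def mean_rmse(x_true, imputations, missing_mask):
    """Average of masked_rmse over several completed datasets (assumed meaning of 'Mean RMSE')."""
    return float(np.mean([masked_rmse(x_true, imp, missing_mask) for imp in imputations]))

# Tiny worked example: two hidden cells, each off by 0.5.
x_true = np.array([[1.0, 2.0], [3.0, 4.0]])
x_imp = np.array([[1.0, 2.5], [3.0, 3.5]])
hidden = np.array([[False, True], [False, True]])
score = masked_rmse(x_true, x_imp, hidden)   # 0.5
```

Because the error is taken only over hidden cells, the score does not reward a method for simply copying through the values that were already observed.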