
Graphical Models for Processing Missing Data

About

This paper reviews recent advances in missing data research using graphical models to represent multivariate dependencies. We first examine the limitations of traditional frameworks from three different perspectives: transparency, estimability, and testability. We then show how procedures based on graphical models can overcome these limitations and provide meaningful performance guarantees even when data are Missing Not At Random (MNAR). In particular, we identify conditions that guarantee consistent estimation in broad categories of missing data problems, and derive procedures for implementing this estimation. Finally, we derive testable implications for missing data models in both the MAR (Missing At Random) and MNAR categories.
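The distinction between MAR and MNAR that the abstract relies on can be made concrete with a small simulation. The sketch below is illustrative only and is not taken from the paper: the variables `x` and `y`, the missingness probabilities, and the helper `adjusted_mean` are all assumptions chosen for the example. It compares a complete-case mean with a mean adjusted by stratifying on a fully observed covariate, first when missingness depends only on that covariate (MAR) and then when it depends on the partially observed variable itself (one kind of MNAR).

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Hypothetical data: X is binary and fully observed, Y depends on X.
x = rng.integers(0, 2, size=n)
y = 1.0 + 2.0 * x + rng.normal(size=n)  # population mean of Y is 2.0


def adjusted_mean(x, y, observed):
    """Estimate E[Y] as the sum over x of E[Y | X=x, Y observed] * P(X=x)."""
    total = 0.0
    for value in (0, 1):
        stratum = x == value
        total += y[stratum & observed].mean() * stratum.mean()
    return total


# MAR: whether Y is observed depends only on the fully observed X.
observed_mar = rng.random(n) < np.where(x == 1, 0.2, 0.9)

# MNAR (self-masking): whether Y is observed depends on Y itself.
observed_mnar = rng.random(n) < 1.0 / (1.0 + np.exp(y - 2.0))

true_mean = y.mean()
cc_mar = y[observed_mar].mean()                 # complete-case mean, MAR
adj_mar = adjusted_mean(x, y, observed_mar)     # stratified on X, MAR
cc_mnar = y[observed_mnar].mean()               # complete-case mean, MNAR
adj_mnar = adjusted_mean(x, y, observed_mnar)   # stratified on X, MNAR

print(f"true mean          {true_mean:.3f}")
print(f"MAR  complete-case {cc_mar:.3f}   adjusted {adj_mar:.3f}")
print(f"MNAR complete-case {cc_mnar:.3f}   adjusted {adj_mnar:.3f}")
```

In this toy setup the complete-case mean is biased in both scenarios, stratifying on `x` removes the bias under MAR, and the same adjustment stays biased when `y` drives its own missingness. Whether a quantity can be consistently estimated in a given MNAR model depends on the structure of the missingness graph, which is the question the paper addresses.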

Karthika Mohan, Judea Pearl • 2018

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Biomarker-level imputation | Semi-synthetic biomarkers 80% missingness | MAE (Mean Absolute Error) 2.875 | 7 |
| Biomarker-level imputation | Semi-synthetic biomarkers 30% missingness | MAE 2.755 | 7 |
| Biomarker-level imputation | Semi-synthetic biomarkers 50% missingness | MAE 2.833 | 7 |
| Imputation | Semi-synthetic EHR dataset MNAR 30% (test) | MAE 5 | 6 |
| Imputation | Semi-synthetic EHR dataset MNAR 50% (test) | MAE 5 | 6 |
| Imputation | Semi-synthetic EHR dataset MNAR 80% (test) | MAE 5 | 6 |
| Imputation | Semi-synthetic EHR dataset Pooled 30-80% MNAR (summary) | Mean Rank 3.92 | 6 |
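The benchmark results above are reported as MAE (Mean Absolute Error). The page does not say how the metric was computed for these datasets, so the following is only a sketch of one common convention for imputation benchmarks: averaging the absolute difference between imputed and true values over the cells that were held out. The function name `masked_mae` and the toy arrays are hypothetical.

```python
import numpy as np


def masked_mae(truth, imputed, mask):
    """Mean absolute error over the cells where mask is True (held-out entries)."""
    truth = np.asarray(truth, dtype=float)
    imputed = np.asarray(imputed, dtype=float)
    mask = np.asarray(mask, dtype=bool)
    if not mask.any():
        raise ValueError("mask selects no entries to evaluate")
    return float(np.abs(truth[mask] - imputed[mask]).mean())


# Toy example: two cells were hidden and then imputed.
truth = [[1.0, 2.0], [3.0, 4.0]]
imputed = [[1.0, 2.5], [2.0, 4.0]]
mask = [[False, True], [True, False]]

score = masked_mae(truth, imputed, mask)
print(score)
```

Restricting the average to held-out cells keeps observed entries, which an imputer normally copies through unchanged, from diluting the error.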
