not-MIWAE: Deep Generative Modelling with Missing not at Random Data

About

When a missing process depends on the missing values themselves, it needs to be explicitly modelled and taken into account while doing likelihood-based inference. We present an approach for building and fitting deep latent variable models (DLVMs) in cases where the missing process is dependent on the missing data. Specifically, a deep neural network enables us to flexibly model the conditional distribution of the missingness pattern given the data. This allows for incorporating prior information about the type of missingness (e.g. self-censoring) into the model. Our inference technique, based on importance-weighted variational inference, involves maximising a lower bound of the joint likelihood. Stochastic gradients of the bound are obtained by using the reparameterisation trick both in latent space and data space. We show on various kinds of data sets and missingness patterns that explicitly modelling the missing process can be invaluable.
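
The inference scheme described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the linear Gaussian "encoder" and "decoder", the logistic self-censoring missing model, the `params` dictionary, and the convention that `s = 1` marks an observed entry are all assumptions made for illustration. The sketch only evaluates a Monte Carlo estimate of an importance-weighted bound for one data point, sampling reparameterised latents `z` and reparameterised missing values, and it assumes importance weights of the form p(s | x_obs, x_mis) p(x_obs | z) p(z) / q(z | x_obs). Training with deep networks and stochastic gradients is left out.

```python
import numpy as np

def log_normal(x, mean, std):
    """Elementwise log density of N(mean, std^2)."""
    return -0.5 * np.log(2 * np.pi) - np.log(std) - 0.5 * ((x - mean) / std) ** 2

def log_sigmoid(t):
    """Numerically stable log(sigmoid(t))."""
    return -np.logaddexp(0.0, -t)

def not_miwae_bound(x, s, params, K=50, rng=None):
    """Monte Carlo estimate of an importance-weighted bound for one data point.

    x: (D,) data vector; entries where s == 0 are ignored (may be NaN).
    s: (D,) mask, 1 = observed, 0 = missing (assumed convention).
    params: hypothetical dictionary of toy model parameters.
    """
    rng = np.random.default_rng() if rng is None else rng
    obs = s.astype(bool)
    x_zero = np.where(obs, x, 0.0)  # zero-fill missing entries for the encoder

    # Toy linear "encoder": q(z | x_obs) = N(enc_W @ x_zero, enc_sigma^2)
    mu_q = params["enc_W"] @ x_zero
    sig_q = params["enc_sigma"]
    z = mu_q + sig_q * rng.standard_normal((K, mu_q.shape[0]))  # reparameterised, (K, H)

    # Toy linear "decoder": p(x | z) = N(dec_W @ z + dec_b, dec_sigma^2)
    mu_x = z @ params["dec_W"].T + params["dec_b"]  # (K, D)
    sig_x = params["dec_sigma"]
    x_samp = mu_x + sig_x * rng.standard_normal(mu_x.shape)  # reparameterised in data space
    x_mix = np.where(obs, x, x_samp)  # keep observed values, fill in sampled missing ones

    # Self-censoring missing model: p(s_j = 1 | x_j) = sigmoid(a * (x_j - c))
    logits = params["miss_a"] * (x_mix - params["miss_c"])
    log_p_s = np.where(obs, log_sigmoid(logits), log_sigmoid(-logits)).sum(axis=1)

    # Likelihood of the observed entries only, prior, and variational density
    log_p_xo = (log_normal(x_zero, mu_x, sig_x) * obs).sum(axis=1)
    log_p_z = log_normal(z, 0.0, 1.0).sum(axis=1)
    log_q_z = log_normal(z, mu_q, sig_q).sum(axis=1)

    # Log importance weights, then a stable log-mean-exp over the K samples
    log_w = log_p_s + log_p_xo + log_p_z - log_q_z
    m = log_w.max()
    return m + np.log(np.exp(log_w - m).sum()) - np.log(K)

# Usage with arbitrary toy parameters (all values are illustrative)
rng = np.random.default_rng(0)
D, H = 4, 2
params = {
    "enc_W": 0.1 * rng.standard_normal((H, D)),
    "enc_sigma": 1.0,
    "dec_W": rng.standard_normal((D, H)),
    "dec_b": np.zeros(D),
    "dec_sigma": 1.0,
    "miss_a": 1.0,
    "miss_c": 0.0,
}
x = np.array([0.5, -1.2, np.nan, 2.0])  # third feature is missing
s = np.array([1, 1, 0, 1])
bound = not_miwae_bound(x, s, params, K=100, rng=rng)
```

The missing model is evaluated on the mixed vector of observed and sampled values, so the sampled missing entries influence the weights. This is what lets prior knowledge about the missingness mechanism, such as self-censoring, enter the bound.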

Niels Bruun Ipsen, Pierre-Alexandre Mattei, Jes Frellsen • 2020

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Classification | Aust. 30% MCAR | F1 Score | 65.8 | 12 |
| Classification | Adult 30% MAR | F1 Score | 29.4 | 12 |
| Classification | Adult 30% MCAR | F1 Score | 24.5 | 12 |
| Classification | Wine 30% MNAR | F1 Score | 87.5 | 12 |
| Classification | Breast 30% MCAR | F1 Score | 42.4 | 12 |
| Classification | Bank 30% MCAR | F1 Score | 73.5 | 12 |
| Classification | Adult 30% MNAR | F1 Score | 20.1 | 12 |
| Missing Data Imputation | Heart 30% MCAR | Average Error | 0.174 | 11 |
| Missing Data Imputation | Hous. 30% MCAR | Average Error | 0.075 | 11 |
| Missing Data Imputation | Yacht 30% MCAR | Average Error | 0.175 | 11 |
Showing 10 of 45 rows
