Amortized Variational Inference for Logistic Regression with Missing Covariates

About

Missing covariate data pose a significant challenge to statistical inference and machine learning, particularly for classification tasks like logistic regression. Classical iterative approaches (EM, multiple imputation) are often computationally intensive, sensitive to high missingness rates, and limited in uncertainty propagation. Recent deep generative models based on VAEs show promise but rely on complex latent representations. We propose Amortized Variational Inference for Logistic Regression (AV-LR), a unified end-to-end framework for binary logistic regression with missing covariates. AV-LR integrates a probabilistic generative model with a simple amortized inference network, trained jointly by maximizing the evidence lower bound. Unlike competing methods, AV-LR performs inference directly in the space of missing data without additional latent variables, using a single inference network and a linear layer that jointly estimate regression parameters and the missingness mechanism. AV-LR achieves estimation accuracy comparable to or better than state-of-the-art EM-like algorithms, with significantly lower computational cost. It naturally extends to missing-not-at-random settings by explicitly modeling the missingness mechanism. Empirical results on synthetic and real-world datasets confirm its effectiveness and efficiency across various missing-data scenarios.

M. Cherifi, Aude Sportisse, Xujia Zhu, Mohammed Nabil El Korso, A. Mesloub • 2026
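
The abstract describes maximizing an evidence lower bound (ELBO) with a variational distribution placed directly on the missing covariates. The block below is a minimal sketch of that idea, not the authors' implementation. Everything specific in it is an assumption made for illustration: the independent Gaussian covariate model, the linear `encoder`, the dictionary layout of `model` and `enc`, and the function names. It also ignores the missingness mechanism, so it corresponds to an ignorable (MCAR-style) setting. It computes a Monte Carlo ELBO estimate for a single observation.

```python
import math
import random


def log_normal(x, mean, std):
    """Log density of a univariate Gaussian."""
    return -0.5 * math.log(2 * math.pi) - math.log(std) - 0.5 * ((x - mean) / std) ** 2


def log_bernoulli_logit(y, logit):
    """Log p(y | logit) for y in {0, 1}, using a stable softplus."""
    z = logit if y == 1 else -logit
    # log sigmoid(z) = -softplus(-z)
    return -(max(-z, 0.0) + math.log1p(math.exp(-abs(z))))


def encoder(x, mask, y, enc):
    """Hypothetical amortized inference network: one linear layer mapping
    (zero-filled covariates, mask, label) to a Gaussian mean per covariate."""
    feats = [xi if m else 0.0 for xi, m in zip(x, mask)]
    feats += [float(m) for m in mask] + [float(y)]
    means = [sum(w * f for w, f in zip(row, feats)) + b
             for row, b in zip(enc["W"], enc["b"])]
    return means, enc["std"]


def elbo_estimate(x, mask, y, model, enc, n_samples=64, rng=random):
    """Monte Carlo ELBO for one observation; mask[j] is True when x[j] is observed."""
    d = len(x)
    q_means, q_std = encoder(x, mask, y, enc)
    # Observed covariates contribute their log density exactly.
    log_p_obs = sum(log_normal(x[j], model["mu"][j], model["sigma"][j])
                    for j in range(d) if mask[j])
    total = 0.0
    for _ in range(n_samples):
        x_full = list(x)
        term = 0.0
        for j in range(d):
            if not mask[j]:
                # Reparameterized sample of the missing covariate from q.
                x_full[j] = q_means[j] + q_std * rng.gauss(0.0, 1.0)
                term += log_normal(x_full[j], model["mu"][j], model["sigma"][j])
                term -= log_normal(x_full[j], q_means[j], q_std)
        logit = model["beta0"] + sum(b * v for b, v in zip(model["beta"], x_full))
        term += log_bernoulli_logit(y, logit)
        total += term
    return log_p_obs + total / n_samples


# Illustrative parameters (assumed, not taken from the paper).
model = {"mu": [0.0, 1.0], "sigma": [1.0, 2.0], "beta0": -0.5, "beta": [0.8, -0.3]}
enc = {"W": [[0.0] * 5, [0.0] * 5], "b": [0.0, 1.0], "std": 1.5}
print(elbo_estimate([0.3, None], [True, False], 1, model, enc, rng=random.Random(0)))
```

In a trainable version, the entries of `model` and `enc` would be optimized jointly by gradient ascent on this quantity averaged over the dataset. To cover the missing-not-at-random case mentioned in the abstract, one would presumably add a term for the log probability of the mask given the completed covariates inside the expectation; that term is omitted here.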

Related benchmarks

Task                        Dataset                            Result                    Rank
Classification              banknote                           AUC 87.3                  18
Classification              pima                               AUC 0.787                 18
Classification              Rice                               AUC 0.977                 18
Classification              Breastcancer                       AUC 98.6                  18
Execution time measurement  Breast Cancer (50% MNAR)           Training Time (s) 18.051  15
Execution time measurement  Pima 50% MNAR                      Training Time (s) 22.26   15
Execution time measurement  BankNote 50% MNAR                  Training Time 42.539      15
Execution time measurement  Rice 50% MNAR                      Training Time (s) 105     15
Binary Classification       Synthetic 50% MCAR (test)          AUC 74.2                  7
Classification              Synthetic Dataset 60% MNAR (test)  AUC 77.1                  7
Showing 10 of 21 rows
