Modelling Cellular Perturbations with the Sparse Additive Mechanism Shift Variational Autoencoder

About

Generative models of observations under interventions have been a vibrant topic of interest across machine learning and the sciences in recent years. For example, in drug discovery, there is a need to model the effects of diverse interventions on cells in order to characterize unknown biological mechanisms of action. We propose the Sparse Additive Mechanism Shift Variational Autoencoder, SAMS-VAE, to combine compositionality, disentanglement, and interpretability for perturbation models. SAMS-VAE models the latent state of a perturbed sample as the sum of a local latent variable capturing sample-specific variation and sparse global variables of latent intervention effects. Crucially, SAMS-VAE sparsifies these global latent variables for individual perturbations to identify disentangled, perturbation-specific latent subspaces that are flexibly composable. We evaluate SAMS-VAE both quantitatively and qualitatively on a range of tasks using two popular single-cell sequencing datasets. In order to measure perturbation-specific model properties, we also introduce a framework for evaluation of perturbation models based on average treatment effects with links to posterior predictive checks. SAMS-VAE outperforms comparable models in terms of generalization across in-distribution and out-of-distribution tasks, including a combinatorial reasoning task under resource paucity, and yields interpretable latent structures which correlate strongly with known biological mechanisms. Our results suggest SAMS-VAE is an interesting addition to the modeling toolkit for machine learning-driven scientific discovery.
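The additive latent structure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `compose_latent`, the argument names, and the toy values are all assumptions made for illustration. Fixed numbers stand in for quantities the model would infer (the sample-specific latent, the sparse masks, and the perturbation embeddings), and no encoder or decoder is included.

```python
from typing import List, Sequence


def compose_latent(
    z_basal: Sequence[float],
    dosages: Sequence[float],
    masks: Sequence[Sequence[int]],
    embeddings: Sequence[Sequence[float]],
) -> List[float]:
    """Return z = z_basal + sum_t dosages[t] * (masks[t] * embeddings[t]).

    z_basal    : sample-specific (local) latent vector (hypothetical name)
    dosages    : one entry per perturbation, 0 if not applied to this sample
    masks      : binary vectors that sparsify each perturbation's effect
    embeddings : dense latent effect vectors, one per perturbation
    """
    z = list(z_basal)
    for d_t, m_t, e_t in zip(dosages, masks, embeddings):
        for k in range(len(z)):
            # A masked-out dimension (m_t[k] == 0) contributes nothing.
            z[k] += d_t * m_t[k] * e_t[k]
    return z


# Toy setup: 4 latent dimensions, 2 perturbations (A and B).
z_basal = [0.5, -1.0, 0.0, 2.0]
masks = [[1, 0, 0, 0], [0, 0, 1, 0]]  # A acts on dim 0, B on dim 2
embeddings = [[2.0, 9.0, 9.0, 9.0], [9.0, 9.0, -3.0, 9.0]]

z_control = compose_latent(z_basal, [0, 0], masks, embeddings)
z_a = compose_latent(z_basal, [1, 0], masks, embeddings)
z_ab = compose_latent(z_basal, [1, 1], masks, embeddings)

print(z_control)
print(z_a)
print(z_ab)
```

In this toy setup the masks confine each perturbation to its own latent dimension, so the combined sample `z_ab` is just the control latent shifted by both single-perturbation effects. That is the sense in which sparse, additive effects are composable.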

Michael Bereket, Theofanis Karaletsos • 2023

Related benchmarks

Task	Dataset	Metric	Value	
Perturbation response modelling	Srivatsan20	Cosine logFC	0.53	20
Perturbation response modelling	Norman19	Cosine logFC	0.78	20
Perturbation response modelling	Jiang24	Cosine logFC	0.59	19
Combo prediction	Norman 19	MMD GEX	4.1	14
Covariate transfer task	Srivatsan20 (test)	MMD GEX	2.5	14
Single-cell perturbation prediction	E-MTAB-14065 (LOO mode)	Delta Pearson (DE genes)	0.7531	10
Single-cell perturbation prediction	hiPSC temporal CRISPR LOO All genes	RMSE	0.182	10
Single-cell perturbation prediction	hiPSC temporal CRISPR LOO, DE genes	AUC-ROC	94.1	9
Perturbation prediction	Norman 2019 (Single-gene)	RMSE (10 genes)	0.4114	6
Perturbation prediction	Norman2019 (double-gene)	RMSE (10 genes)	0.4605	6

Showing 10 of 15 rows
