
What Makes a Representation Good for Single-Cell Perturbation Prediction?

About

Single-cell perturbation modeling is fundamental for understanding and predicting cellular responses to genetic perturbations. However, existing approaches, from causal representation learning to foundation models, often struggle with an overlooked challenge: gene expression is dominated by perturbation-invariant information, while perturbation-specific signals are intrinsically sparse. As a result, learned representations either entangle invariant and perturbation-specific information, leading to spurious and non-generalizable predictors, or suppress perturbation-specific signals altogether, rendering them ineffective for prediction. To address this, we propose PerturbedVAE, a general framework designed to resolve this signal imbalance. The framework explicitly separates perturbation-specific information from dominant invariant structure and recovers causal representations to effectively utilize such information for prediction. We further provide an identifiability analysis that characterizes the conditions under which sparse perturbation effects can be reliably recovered, thereby clarifying how the framework can be concretely specified under such conditions. Empirically, PerturbedVAE achieves state-of-the-art performance on a widely used benchmark across multiple evaluation settings, yielding significant gains on out-of-distribution combinatorial predictions and uncovering interpretable perturbation-response programs.

Wenkang Jiang, Yuhang Liu, Yichao Cai, Erdun Gao, Jiayi Dong, Ehsan Abbasnejad, Lina Yao, Javen Qinfeng Shi • 2026

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Perturbation prediction | Perturb-seq double-gene perturbation | RMSE | 0.4474 | 8 |
| Perturbation prediction | Norman 2019 (Single-gene) | RMSE (10 genes) | 0.4027 | 6 |
| Perturbation prediction | Norman2019 (double-gene) | RMSE (10 genes) | 0.4493 | 6 |
| Single-cell perturbation prediction | Replogle2022 (single-gene i.i.d.) | L2 Distance | 8.296 | 5 |
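
For readers unfamiliar with the metrics listed above, the following is a minimal sketch of the textbook definitions of RMSE and L2 distance between a predicted and an observed expression vector. It is not taken from the paper or the benchmark: the function names, the `gene_idx` parameter, and the toy values are hypothetical, and how each benchmark selects its gene subset (for example, the "10 genes" in the RMSE rows) or aggregates over cells is not stated here.

```python
import math

def rmse(pred, true, gene_idx=None):
    """Root-mean-square error between two expression vectors.

    gene_idx (hypothetical parameter): optional list of gene positions to
    restrict the comparison to; the benchmark's actual gene selection rule
    is an assumption left to the caller.
    """
    if gene_idx is None:
        gene_idx = range(len(pred))
    squared_errors = [(pred[i] - true[i]) ** 2 for i in gene_idx]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

def l2_distance(pred, true):
    """Euclidean (L2) distance between two expression vectors."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, true)))

# Toy values for illustration only, not real expression data.
predicted = [1.0, 2.0, 3.0]
observed = [1.0, 2.0, 5.0]
print(rmse(predicted, observed))               # error over all three genes
print(rmse(predicted, observed, [0, 1]))       # error over a two-gene subset
print(l2_distance([0.0, 0.0], [3.0, 4.0]))
```

Both metrics are lower-is-better. RMSE normalizes by the number of genes compared, whereas L2 distance does not, so values from the two metrics are not directly comparable.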