MechPert: Mechanistic Consensus as an Inductive Bias for Unseen Perturbation Prediction
About
Predicting transcriptional responses to unseen genetic perturbations is essential for understanding gene regulation and prioritizing large-scale perturbation experiments. Existing approaches either rely on static, potentially incomplete knowledge graphs, or prompt language models for functionally similar genes, retrieving associations shaped by symmetric co-occurrence in scientific text rather than directed regulatory logic. We introduce MechPert, a lightweight framework that encourages LLM agents to generate directed regulatory hypotheses rather than relying solely on functional similarity. Multiple agents independently propose candidate regulators with associated confidence scores; these are aggregated through a consensus mechanism that filters spurious associations, producing weighted neighborhoods for downstream prediction. We evaluate MechPert on Perturb-seq benchmarks across four human cell lines. For perturbation prediction in low-data regimes ($N=50$ observed perturbations), MechPert improves Pearson correlation by up to 10.5\% over similarity-based baselines. For experimental design, MechPert-selected anchor genes outperform standard network centrality heuristics by up to 46\% in well-characterized cell lines.
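The consensus step described above can be illustrated with a minimal sketch. The abstract does not specify the aggregation rule, so everything here is an assumption made for illustration: the function name `consensus_neighborhood`, the `min_votes` threshold, and the choice to average confidences over all agents before normalizing. The gene names in the example are placeholders, not results from the paper.

```python
from collections import defaultdict


def consensus_neighborhood(proposals, min_votes=2):
    """Aggregate per-agent regulator proposals into a weighted neighborhood.

    proposals: list with one dict per agent, mapping gene name -> confidence in [0, 1].
    min_votes: hypothetical threshold; genes proposed by fewer agents are dropped.
    Returns a dict mapping gene name -> weight, with weights summing to 1.
    """
    if not proposals:
        return {}

    # Collect every confidence score reported for each candidate gene.
    scores = defaultdict(list)
    for agent_proposal in proposals:
        for gene, confidence in agent_proposal.items():
            scores[gene].append(confidence)

    # Keep genes with enough independent support. Averaging over ALL agents
    # (not only those that voted) penalizes weakly supported candidates.
    n_agents = len(proposals)
    kept = {
        gene: sum(confs) / n_agents
        for gene, confs in scores.items()
        if len(confs) >= min_votes
    }

    # Normalize so the surviving weights form a convex combination.
    total = sum(kept.values())
    if total == 0:
        return {}
    return {gene: weight / total for gene, weight in kept.items()}


# Illustrative input: three agents, placeholder gene names and scores.
example_proposals = [
    {"GATA1": 0.9, "TAL1": 0.6},
    {"GATA1": 0.8, "MYC": 0.3},
    {"GATA1": 0.7, "TAL1": 0.5},
]
example_weights = consensus_neighborhood(example_proposals)
```

In this example, `MYC` is proposed by a single agent and is filtered out, while `GATA1` receives a larger weight than `TAL1` because all three agents support it. A downstream predictor could then, for instance, combine the observed responses of the retained genes using these weights, though the abstract does not state how the weighted neighborhoods are consumed.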
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Few-Shot Generalization | K562, RPE1, Jurkat, HepG2 pooled cell lines | C20 Correlation | 0.62 | 15 |
| Global genome prediction | K562 cell line | Pearson correlation | 0.536 | 4 |
| Global genome prediction | RPE1 cell line | Pearson correlation | 0.614 | 4 |
| Global genome prediction | Jurkat cell line | Pearson correlation | 0.26 | 4 |
| Global genome prediction | HepG2 cell line | Pearson correlation | 0.181 | 4 |
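
For reference, the Pearson correlation reported in the table measures linear agreement between predicted and observed values. The sketch below computes it for two equal-length vectors using only the standard library. How the benchmark forms those vectors (which genes are included, and whether scores are averaged per perturbation) is not stated here, so the helper `pearson` and its inputs are illustrative assumptions rather than the benchmark's evaluation code.

```python
import math


def pearson(predicted, observed):
    """Pearson correlation between two equal-length sequences of numbers."""
    if len(predicted) != len(observed):
        raise ValueError("inputs must have the same length")
    if len(predicted) < 2:
        raise ValueError("at least two values are required")

    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_o = sum(observed) / n

    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((p - mean_p) * (o - mean_o) for p, o in zip(predicted, observed))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_o = sum((o - mean_o) ** 2 for o in observed)

    # The correlation is undefined when either vector is constant.
    if var_p == 0 or var_o == 0:
        return float("nan")
    return cov / math.sqrt(var_p * var_o)
```

A value of 1 indicates perfect positive linear agreement, 0 indicates no linear relationship, and -1 indicates perfect inverse agreement.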