Conformal Drug Property Prediction with Density Estimation under Covariate Shift

About

In drug discovery, predictions of pharmaceutical properties from computational models must be confirmed with costly wet-lab experiments, so reliable uncertainty estimates are crucial for prioritizing drug molecules for experimental validation. Conformal Prediction (CP) is a promising tool for building prediction sets for molecular properties with a coverage guarantee. However, the exchangeability assumption of CP is often violated by covariate shift in drug discovery tasks: most datasets contain limited labeled data, which may not be representative of the vast chemical space from which molecules are drawn. To address this limitation, we propose CoDrug, a method that employs an energy-based model leveraging both training data and unlabeled data, together with Kernel Density Estimation (KDE), to estimate the densities of a set of molecules. The estimated densities are then used to weight the molecule samples when building prediction sets, correcting for distribution shift. In extensive experiments involving realistic distribution shifts in various small-molecule drug discovery tasks, we demonstrate that CoDrug provides valid prediction sets and is useful for addressing the distribution shift arising from de novo drug design models. On average, CoDrug can reduce the coverage gap by over 35% compared to conformal prediction sets not adjusted for covariate shift.
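
The density-weighting idea in the abstract can be illustrated with a minimal sketch of standard weighted conformal prediction, in which each calibration sample is weighted by an estimated density ratio. This is not the CoDrug implementation: the 1-D features, the plain Gaussian KDE (standing in for KDE over energy-based model features), the nonconformity score `1 - p`, and every function name below are assumptions made for illustration.

```python
import math


def kde_density(x, samples, bandwidth=0.5):
    """Gaussian KDE estimate of the density at x from 1-D samples."""
    norm = len(samples) * bandwidth * math.sqrt(2.0 * math.pi)
    return sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples) / norm


def weighted_quantile(scores, weights, test_weight, alpha):
    """(1 - alpha) quantile of the calibration scores under normalized weights.

    The test point's own weight is placed on +infinity, so the result is
    infinite when the calibration mass alone cannot reach 1 - alpha.
    """
    total = sum(weights) + test_weight
    cumulative = 0.0
    for score, weight in sorted(zip(scores, weights)):
        cumulative += weight / total
        if cumulative >= 1.0 - alpha:
            return score
    return math.inf


def prediction_set(class_probs, x_feat, cal_feats, cal_scores, test_feats, alpha=0.1):
    """Labels whose nonconformity score (1 - probability) is within the threshold."""

    def ratio(z):
        # Assumed density ratio p_test(z) / p_cal(z), both estimated by KDE.
        return kde_density(z, test_feats) / max(kde_density(z, cal_feats), 1e-12)

    cal_weights = [ratio(z) for z in cal_feats]
    threshold = weighted_quantile(cal_scores, cal_weights, ratio(x_feat), alpha)
    return [label for label, p in enumerate(class_probs) if 1.0 - p <= threshold]
```

When the calibration and test densities coincide, every weight equals 1 and the procedure reduces to ordinary split conformal prediction. Calibration molecules that resemble the test distribution receive larger weights, which moves the score threshold toward what is typical of the shifted data.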

Siddhartha Laghuvarapu, Zhen Lin, Jimeng Sun • 2023

Related benchmarks

Task | Dataset | Result | Rank
--- | --- | --- | ---
De Novo Drug Design | QED alpha=0.05 | Observed Coverage 100 | 16
Conformal Prediction Coverage | REINVENT generated molecules (test) | Coverage 96 | 12
Conformal Prediction Coverage | GraphGA generated molecules (test) | Observed Coverage 98 | 12
De Novo Drug Design | QED alpha=0.2 | Observed Coverage 90 | 8
De Novo Drug Design | JNK3+QED alpha=0.05 | Observed Coverage 98 | 8
De Novo Drug Design | JNK3+QED alpha=0.2 | Observed Coverage 86 | 8
De Novo Drug Design | GSK3b+QED alpha=0.2 | Observed Coverage 90 | 8
Conformal Prediction | AMES Y=1 (Fingerprint Splitting) | Coverage 90 | 3
Conformal Prediction | ClinTox (Y=0) (Fingerprint Splitting) | Coverage 86 | 3
Conformal Prediction | HIV Y=1 Fingerprint Splitting | Coverage 95 | 3
Showing 10 of 55 rows
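
For readers unfamiliar with the coverage metric, the sketch below shows one common way to compute observed coverage and its distance from the target level 1 - alpha. It assumes coverage is the fraction of test molecules whose prediction set contains the true label, and that the gap is the absolute difference from the target; the paper may define the gap differently, and the function names are hypothetical. Values are fractions here, whereas the Result column above appears to be in percent.

```python
def observed_coverage(prediction_sets, true_labels):
    """Fraction of test points whose prediction set contains the true label."""
    hits = sum(1 for labels, y in zip(prediction_sets, true_labels) if y in labels)
    return hits / len(true_labels)


def coverage_gap(coverage, alpha):
    """Absolute difference between observed coverage and the target 1 - alpha."""
    return abs(coverage - (1.0 - alpha))
```

Under these assumptions, an observed coverage of 0.86 at alpha = 0.2 is 0.06 away from the 0.8 target.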
