When Does Context Help? A Systematic Study of Target-Conditional Molecular Property Prediction
About
We present the first systematic study of when target context helps molecular property prediction, evaluating context conditioning across 10 diverse protein families, 4 fusion architectures, data regimes spanning 67-9,409 training compounds, and both temporal and random evaluation splits. Using NestDrug, a FiLM-based architecture that conditions molecular representations on target identity, we characterize both success and failure modes with three principal findings. First, fusion architecture dominates: FiLM outperforms concatenation by 24.2 percentage points and additive conditioning by 8.6 pp; how you incorporate context matters more than whether you include it. Second, context enables otherwise impossible predictions: on data-scarce CYP3A4 (67 training compounds), multi-task transfer achieves 0.686 AUC where per-target Random Forest collapses to 0.238. Third, context can systematically hurt: distribution mismatch causes 10.2 pp degradation on BACE1; few-shot adaptation consistently underperforms zero-shot. Beyond methodology, we expose fundamental flaws in standard benchmarking: 1-nearest-neighbor Tanimoto achieves 0.991 AUC on DUD-E without any learning, and 50% of actives leak from training data, rendering absolute performance metrics meaningless. Our temporal split evaluation (train up to 2020, test 2021-2024) achieves stable 0.843 AUC with no degradation, providing the first rigorous evidence that context-conditional molecular representations generalize to future chemical space.
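The three fusion strategies compared in the abstract can be sketched side by side. This is a minimal NumPy illustration, not NestDrug's actual implementation: the per-target lookup tables `gamma_table` and `beta_table`, the embedding width `d`, and the function names are all assumptions introduced here. It shows only the structural difference. FiLM applies a feature-wise scale and shift, additive conditioning applies a shift alone, and concatenation appends the context as extra features.

```python
import numpy as np

rng = np.random.default_rng(0)


def film_fuse(h, gamma, beta):
    """FiLM: feature-wise affine modulation, h' = gamma * h + beta."""
    return gamma * h + beta


def additive_fuse(h, c):
    """Additive conditioning: a per-feature shift with no scaling."""
    return h + c


def concat_fuse(h, c):
    """Concatenation: the context vector is appended as extra features."""
    return np.concatenate([h, c], axis=-1)


n_targets, d = 10, 8  # hypothetical sizes, chosen for illustration

# Hypothetical stand-ins for learned parameters: one (gamma, beta) per target.
gamma_table = 1.0 + 0.1 * rng.standard_normal((n_targets, d))
beta_table = 0.1 * rng.standard_normal((n_targets, d))

h = rng.standard_normal((4, d))      # four molecule embeddings
target_ids = np.array([0, 3, 3, 7])  # target index for each molecule

film_out = film_fuse(h, gamma_table[target_ids], beta_table[target_ids])
add_out = additive_fuse(h, beta_table[target_ids])
cat_out = concat_fuse(h, beta_table[target_ids])

print(film_out.shape, add_out.shape, cat_out.shape)
```

With `gamma` fixed to ones, FiLM reduces to additive conditioning, so the additive variant can be read as a restricted special case of FiLM in this sketch.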
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Hit discovery (DMTA replay) | DUD-E | Hit Rate | 87.6 | 18 |
| Virtual Screening | DUD-E 10 diverse targets | Mean AUC | 85 | 6 |
| Molecular property prediction | ChEMBL temporal split 2021 (test) | ROC-AUC | 0.849 | 1 |
| Molecular property prediction | ChEMBL temporal 2022 (test) | ROC-AUC | 0.838 | 1 |
| Molecular property prediction | ChEMBL temporal split 2023 (test) | ROC-AUC | 82.3 | 1 |
| Molecular property prediction | ChEMBL temporal split 2024 (test) | ROC-AUC | 84.9 | 1 |
| Molecular property prediction | ChEMBL 2021-2024 temporal split overall (test) | ROC-AUC | 0.843 | 1 |
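The DUD-E rows are best read alongside the abstract's caveat that a 1-nearest-neighbor Tanimoto baseline reaches 0.991 AUC without any learning. The sketch below shows what such a baseline and a simple leakage check might look like. It is an illustration under assumptions, not the paper's evaluation code: fingerprints are represented as plain Python sets of on-bit indices, the toy molecules are invented, and "leak" is taken here to mean an exact fingerprint match with a training active, which may differ from the authors' definition.

```python
def tanimoto(a, b):
    """Tanimoto similarity of two fingerprints given as sets of on-bit indices."""
    if not a and not b:
        return 0.0
    inter = len(a & b)
    return inter / (len(a) + len(b) - inter)


def one_nn_scores(train_actives, queries):
    """Score each query by its maximum similarity to any training active."""
    return [max(tanimoto(q, t) for t in train_actives) for q in queries]


def roc_auc(scores, labels):
    """ROC-AUC as the fraction of (active, inactive) pairs ranked correctly.

    Tied pairs count as half a correct ranking.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def leaked_fraction(train_actives, test_actives):
    """Fraction of test actives whose fingerprint also occurs in training."""
    train = {frozenset(t) for t in train_actives}
    return sum(frozenset(a) in train for a in test_actives) / len(test_actives)


# Invented toy fingerprints (assumption: real ones would come from a
# cheminformatics toolkit).
train_actives = [{1, 2, 3, 4}, {2, 3, 5, 8}]
test_fps = [{1, 2, 3, 4}, {2, 3, 5, 9}, {10, 11, 12}, {1, 20, 21, 22}]
test_labels = [1, 1, 0, 0]

scores = one_nn_scores(train_actives, test_fps)
auc = roc_auc(scores, test_labels)
test_actives = [fp for fp, y in zip(test_fps, test_labels) if y == 1]
leak = leaked_fraction(train_actives, test_actives)

print(f"1-NN AUC = {auc:.3f}, leaked actives = {leak:.0%}")
```

In this toy set one of the two test actives is an exact copy of a training active, which is enough for similarity alone to rank every active above every decoy.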