When Does Context Help? A Systematic Study of Target-Conditional Molecular Property Prediction

About

We present the first systematic study of when target context helps molecular property prediction, evaluating context conditioning across 10 diverse protein families, 4 fusion architectures, data regimes spanning 67-9,409 training compounds, and both temporal and random evaluation splits. Using NestDrug, a FiLM-based architecture that conditions molecular representations on target identity, we characterize both success and failure modes with three principal findings. First, fusion architecture dominates: FiLM outperforms concatenation by 24.2 percentage points and additive conditioning by 8.6 pp; how you incorporate context matters more than whether you include it. Second, context enables otherwise impossible predictions: on data-scarce CYP3A4 (67 training compounds), multi-task transfer achieves 0.686 AUC where per-target Random Forest collapses to 0.238. Third, context can systematically hurt: distribution mismatch causes 10.2 pp degradation on BACE1; few-shot adaptation consistently underperforms zero-shot. Beyond methodology, we expose fundamental flaws in standard benchmarking: 1-nearest-neighbor Tanimoto achieves 0.991 AUC on DUD-E without any learning, and 50% of actives leak from training data, rendering absolute performance metrics meaningless. Our temporal split evaluation (train up to 2020, test 2021-2024) achieves stable 0.843 AUC with no degradation, providing the first rigorous evidence that context-conditional molecular representations generalize to future chemical space.
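
The three fusion styles compared in the abstract (FiLM, additive, concatenation) can be illustrated with a minimal sketch. This is not NestDrug's implementation: the class `TargetFiLM`, the per-target lookup tables, and the initialization scheme are all hypothetical, chosen only to show how feature-wise linear modulation differs from the two alternatives when the context is a target identity.

```python
import random


def film(h, gamma, beta):
    """FiLM: scale and shift each feature, h' = gamma * h + beta."""
    return [g * x + b for g, x, b in zip(gamma, h, beta)]


def additive(h, beta):
    """Additive conditioning: shift only, h' = h + beta."""
    return [x + b for x, b in zip(h, beta)]


def concat(h, context):
    """Concatenation: append the context vector to the features."""
    return list(h) + list(context)


class TargetFiLM:
    """Hypothetical layer: one (gamma, beta) pair per target identity."""

    def __init__(self, n_targets, dim, seed=0):
        rng = random.Random(seed)
        # Assumption: start near the identity map (gamma ~ 1, beta ~ 0).
        self.gamma = [[1.0 + rng.gauss(0, 0.1) for _ in range(dim)]
                      for _ in range(n_targets)]
        self.beta = [[rng.gauss(0, 0.1) for _ in range(dim)]
                     for _ in range(n_targets)]

    def __call__(self, h, target_id):
        return film(h, self.gamma[target_id], self.beta[target_id])


# The same molecular embedding, modulated for two different targets.
layer = TargetFiLM(n_targets=10, dim=4)
mol_embedding = [0.5, -1.0, 2.0, 0.0]
out_target_0 = layer(mol_embedding, target_id=0)
out_target_3 = layer(mol_embedding, target_id=3)
```

In a trained model the `gamma` and `beta` tables would be learned parameters; here they are random so the sketch stays dependency-free. Note that additive conditioning is the special case of FiLM with `gamma` fixed at 1.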

Bryan Cheng, Jasper Zhang • 2026

Related benchmarks

Task	Dataset	Metric	Value	Rank
Hit discovery (DMTA replay)	DUD-E	Hit Rate	87.6	18
Virtual Screening	DUD-E 10 diverse targets	Mean AUC	0.85	6
Molecular property prediction	ChEMBL temporal split 2021 (test)	ROC-AUC	0.849	1
Molecular property prediction	ChEMBL temporal 2022 (test)	ROC-AUC	0.838	1
Molecular property prediction	ChEMBL temporal split 2023 (test)	ROC-AUC	0.823	1
Molecular property prediction	ChEMBL temporal split 2024 (test)	ROC-AUC	0.849	1
Molecular property prediction	ChEMBL 2021-2024 temporal split overall (test)	ROC-AUC	0.843	1
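
The temporal protocol described in the abstract (train up to 2020, test on 2021-2024, report ROC-AUC per year) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' evaluation code: the record fields `year`, `label`, and `score` are hypothetical, and the toy records are made up purely to exercise the functions.

```python
def temporal_split(records, cutoff_year=2020):
    """Train on records up to cutoff_year; group later records by year."""
    train = [r for r in records if r["year"] <= cutoff_year]
    test_by_year = {}
    for r in records:
        if r["year"] > cutoff_year:
            test_by_year.setdefault(r["year"], []).append(r)
    return train, test_by_year


def roc_auc(labels, scores):
    """ROC-AUC as the fraction of (active, inactive) pairs ranked correctly.

    Ties count as half a correct pair. Quadratic in the number of
    compounds, which is fine for a sketch.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one active and one inactive")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy records (hypothetical): year of publication, label, model score.
records = [
    {"year": 2019, "label": 1, "score": 0.9},
    {"year": 2020, "label": 0, "score": 0.2},
    {"year": 2021, "label": 1, "score": 0.8},
    {"year": 2021, "label": 0, "score": 0.3},
    {"year": 2022, "label": 1, "score": 0.4},
    {"year": 2022, "label": 0, "score": 0.6},
]

train, test_by_year = temporal_split(records)
auc_by_year = {
    year: roc_auc([r["label"] for r in rows], [r["score"] for r in rows])
    for year, rows in sorted(test_by_year.items())
}
```

Reporting the metric per test year, as the table above does, makes any drift over time visible instead of hiding it inside a single pooled number.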

