FP-GNN: a versatile deep learning architecture for enhanced molecular property prediction
About
Deep learning is an important method for molecular design and has shown considerable ability to predict molecular properties, including physicochemical, bioactive, and ADME/T (absorption, distribution, metabolism, excretion, and toxicity) properties. In this study, we propose a novel deep learning architecture, termed FP-GNN, which combines and simultaneously learns information from molecular graphs and molecular fingerprints. To evaluate the FP-GNN model, we conducted experiments on 13 public datasets, the unbiased LIT-PCBA dataset, and 14 phenotypic screening datasets for breast cell lines. Across these datasets, FP-GNN achieved state-of-the-art performance compared with advanced deep learning and conventional machine learning algorithms. In addition, we analyzed how the choice of molecular fingerprints affects performance, and how much the molecular graph and the molecular fingerprint branches each contribute to the FP-GNN model. Analyses of noise robustness and interpretability also indicated that FP-GNN is competitive in real-world situations.
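
The idea of learning jointly from a molecular graph and a fingerprint can be illustrated with a minimal sketch. This is not the FP-GNN implementation: the mean-neighbour aggregation, the single linear readout, and the names `graph_embedding` and `fuse_and_score` are assumptions chosen for brevity, and the toy molecule and fingerprint bits are made up for illustration.

```python
import math


def graph_embedding(atom_feats, bonds):
    """One round of mean-neighbour message passing, then mean pooling.

    atom_feats: list of per-atom feature vectors (all the same length).
    bonds: list of (i, j) index pairs describing undirected bonds.
    """
    n = len(atom_feats)
    neighbours = {i: [] for i in range(n)}
    for a, b in bonds:
        neighbours[a].append(b)
        neighbours[b].append(a)

    updated = []
    for i in range(n):
        # Each atom averages its own features with those of its neighbours.
        group = [atom_feats[i]] + [atom_feats[j] for j in neighbours[i]]
        updated.append([sum(col) / len(group) for col in zip(*group)])

    # Mean pooling over atoms gives one vector for the whole molecule.
    return [sum(col) / n for col in zip(*updated)]


def fuse_and_score(graph_vec, fp_bits, weights, bias=0.0):
    """Concatenate graph vector and fingerprint bits, then apply a
    logistic readout (a stand-in for a learned prediction head)."""
    fused = list(graph_vec) + [float(b) for b in fp_bits]
    z = sum(w * x for w, x in zip(weights, fused)) + bias
    return 1.0 / (1.0 + math.exp(-z))


# Toy three-atom chain (C-C-O) with one-hot atom types [is_C, is_O].
atoms = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
bonds = [(0, 1), (1, 2)]
fingerprint = [1, 0, 1, 0]  # hypothetical 4-bit fingerprint

g = graph_embedding(atoms, bonds)
weights = [0.5, -0.5, 0.2, 0.0, 0.1, 0.0]  # arbitrary, untrained
print(g, fuse_and_score(g, fingerprint, weights))
```

In a trained model the weights of both branches would be fitted together, so the graph representation and the fingerprint representation inform the same prediction.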
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Molecular property prediction | BACE | ROC-AUC | 88.1 | 55 |
| Molecular property prediction | BBBP | ROC-AUC | 0.935 | 48 |
| Molecular property prediction | ClinTox | ROC-AUC | 84 | 47 |
| Molecular Property Prediction (Regression) | ESOL | RMSE | 0.675 | 36 |
| Molecular Property Prediction (Regression) | Lipophilicity | RMSE | 0.625 | 34 |
| Regression | FreeSolv | RMSE | 0.905 | 33 |
| Molecular property prediction | Tox21 | ROC-AUC | 81.5 | 29 |
| Predicting interactions with proteins | LIT-PCBA (test) | ROC-AUC | 0.759 | 24 |
| Molecular Property Regression | PharmaBench | CYP2C9 Prediction | 16.933 | 15 |
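
For readers unfamiliar with the two metrics that dominate the table, the following is a minimal sketch of how ROC-AUC and RMSE are computed. The function names and the toy inputs are illustrative assumptions, not taken from any benchmark's evaluation code. Note that the ROC-AUC values above appear on two scales (for example 88.1 and 0.935); the function below returns a fraction between 0 and 1.

```python
import math


def rmse(y_true, y_pred):
    """Root-mean-square error for regression targets."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def roc_auc(labels, scores):
    """ROC-AUC as the probability that a random positive is scored
    above a random negative (ties count as half)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy inputs, made up for illustration.
print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
print(rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0]))
```

Higher is better for ROC-AUC and lower is better for RMSE. The pairwise formulation is quadratic in the number of samples, which is fine for a sketch but slower than rank-based implementations on large datasets.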