Molecular Contrastive Learning of Representations via Graph Neural Networks

About

Molecular Machine Learning (ML) bears promise for efficient molecule property prediction and drug discovery. However, labeled molecule data can be expensive and time-consuming to acquire. Due to the limited labeled data, it is a great challenge for supervised-learning ML models to generalize to the giant chemical space. In this work, we present MolCLR: Molecular Contrastive Learning of Representations via Graph Neural Networks (GNNs), a self-supervised learning framework that leverages large unlabeled data (~10M unique molecules). In MolCLR pre-training, we build molecule graphs and develop GNN encoders to learn differentiable representations. Three molecule graph augmentations are proposed: atom masking, bond deletion, and subgraph removal. A contrastive estimator maximizes the agreement of augmentations from the same molecule while minimizing the agreement of different molecules. Experiments show that our contrastive learning framework significantly improves the performance of GNNs on various molecular property benchmarks including both classification and regression tasks. Benefiting from pre-training on the large unlabeled database, MolCLR even achieves state-of-the-art on several challenging benchmarks after fine-tuning. Additionally, further investigations demonstrate that MolCLR learns to embed molecules into representations that can distinguish chemically reasonable molecular similarities.
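
The pre-training recipe described above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not name the contrastive estimator, so the sketch assumes an NT-Xent-style loss (the temperature-scaled cross-entropy popularized by SimCLR). The function names (`mask_atoms`, `delete_bonds`, `nt_xent`), the mask token, the augmentation ratios, and the temperature are all hypothetical choices made for illustration. Subgraph removal is omitted for brevity, and the GNN encoder is replaced by hand-written embedding vectors.

```python
import math
import random


def mask_atoms(atom_features, ratio, rng, mask_token=-1):
    """Atom masking: replace a random subset of atom features with a mask token."""
    n = len(atom_features)
    k = max(1, int(ratio * n))  # mask at least one atom
    masked = set(rng.sample(range(n), k))
    return [mask_token if i in masked else a for i, a in enumerate(atom_features)]


def delete_bonds(bonds, ratio, rng):
    """Bond deletion: drop a random subset of edges from the molecule graph."""
    k = int(ratio * len(bonds))
    dropped = set(rng.sample(range(len(bonds)), k))
    return [b for i, b in enumerate(bonds) if i not in dropped]


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nt_xent(z1, z2, temperature=0.1):
    """Assumed NT-Xent loss: z1[i] and z2[i] embed two augmentations of molecule i.

    Each embedding is pulled toward its partner (positive pair) and pushed
    away from every other embedding in the batch (negatives).
    """
    z = z1 + z2
    n = len(z1)
    total = 0.0
    for i in range(2 * n):
        pos = (i + n) % (2 * n)  # index of the other view of the same molecule
        logits = [cosine(z[i], z[j]) / temperature for j in range(2 * n) if j != i]
        pos_logit = cosine(z[i], z[pos]) / temperature
        m = max(logits)  # subtract the max for a numerically stable log-sum-exp
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_denom - pos_logit
    return total / (2 * n)


if __name__ == "__main__":
    rng = random.Random(0)
    atoms = [6, 6, 8, 7]  # atomic numbers of a toy 4-atom molecule
    bonds = [(0, 1), (1, 2), (2, 3), (1, 3)]
    print(mask_atoms(atoms, 0.25, rng))
    print(delete_bonds(bonds, 0.25, rng))
    # Toy 2-D embeddings standing in for GNN encoder outputs.
    print(nt_xent([[1.0, 0.0], [0.0, 1.0]], [[0.9, 0.1], [0.1, 0.9]]))
```

The loss is small when the two views of each molecule map to nearby embeddings and large when a view is closer to a different molecule, which is the "maximize agreement within a molecule, minimize it across molecules" behavior the abstract describes.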

Yuyang Wang, Jianren Wang, Zhonglin Cao, Amir Barati Farimani • 2021

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Molecular property prediction | MoleculeNet BBBP (scaffold) | ROC-AUC 85 | 117 |
| Molecular property prediction | MoleculeNet SIDER (scaffold) | ROC-AUC 0.659 | 97 |
| Molecular property prediction | MoleculeNet BACE (scaffold) | ROC-AUC 89 | 87 |
| Molecular property prediction | MoleculeNet MUV (scaffold) | ROC-AUC 0.838 | 68 |
| Molecular property prediction | MoleculeNet HIV (scaffold) | ROC-AUC 81.2 | 66 |
| Molecular property prediction | BACE (test) | ROC-AUC 89 | 65 |
| Molecular property prediction | BBBP (test) | ROC-AUC 0.678 | 64 |
| Molecule property prediction | MoleculeNet (scaffold split) | BBBP 72.2 | 58 |
| Molecular property prediction | Tox21 (test) | ROC-AUC 0.751 | 53 |
| Molecular property prediction | SIDER (test) | ROC-AUC 0.598 | 53 |
Showing 10 of 36 rows
