
Exploring Chemical Space with Score-based Out-of-distribution Generation

About

A well-known limitation of existing molecular generative models is that the generated molecules closely resemble those in the training set. To generate truly novel molecules that may have even better properties for de novo drug discovery, more powerful exploration of the chemical space is necessary. To this end, we propose Molecular Out-Of-distribution Diffusion (MOOD), a score-based diffusion scheme that incorporates out-of-distribution (OOD) control into the generative stochastic differential equation (SDE) through a single hyperparameter, and thus incurs no additional cost. Since some novel molecules may not meet the basic requirements of real-world drugs, MOOD performs conditional generation by using the gradients of a property predictor to guide the reverse-time diffusion process toward high-scoring regions according to target properties such as protein-ligand interactions, drug-likeness, and synthesizability. This allows MOOD to search for novel and meaningful molecules rather than generating unseen yet trivial ones. We experimentally validate that MOOD is able to explore the chemical space beyond the training distribution, generating molecules that outscore those found with existing methods, and even the top 0.01% of the original training pool. Our code is available at https://github.com/SeulLee05/MOOD.

Seul Lee, Jaehyeong Jo, Sung Ju Hwang • 2022
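
The guided reverse-time sampling described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: it uses a 1D Gaussian "training distribution" whose score is known in closed form (standing in for a learned score network), a quadratic toy property (standing in for the property predictor), and it assumes the OOD control enters as a (1 - sqrt(lam)) rescaling of the unconditional score. The names `sample`, `lam`, `alpha`, and `target` are hypothetical and chosen only for this example.

```python
import numpy as np

def sample(lam=0.0, alpha=0.0, target=2.0, n=4000, steps=500, seed=0):
    """Euler-Maruyama integration of a guided reverse-time VP-SDE (toy, 1D).

    lam    : OOD hyperparameter in [0, 1); 0 means no OOD control (assumed form).
    alpha  : weight of the toy property gradient; 0 means no guidance.
    target : value the toy property predictor rewards.
    """
    rng = np.random.default_rng(seed)
    beta, T = 8.0, 1.0          # constant noise rate and time horizon
    mu, s = 0.0, 0.5            # toy training distribution N(mu, s^2)
    dt = T / steps
    x = rng.standard_normal(n)  # start from the N(0, 1) prior at t = T
    for i in range(steps):
        t = T - i * dt
        decay = np.exp(-beta * t)
        mean_t = mu * np.sqrt(decay)           # marginal mean at time t
        var_t = s**2 * decay + (1.0 - decay)   # marginal variance at time t
        score = -(x - mean_t) / var_t          # analytic score of p_t
        guide = -(x - target)                  # gradient of toy log p(y | x)
        # Assumption: OOD control shrinks the score, guidance adds a gradient.
        total = (1.0 - np.sqrt(lam)) * score + alpha * guide
        # Reverse-time drift f - g^2 * score, with f = -0.5*beta*x, g^2 = beta.
        drift = -0.5 * beta * x - beta * total
        # Step backward in time by dt.
        x = x - drift * dt + np.sqrt(beta * dt) * rng.standard_normal(n)
    return x

if __name__ == "__main__":
    base = sample()
    ood = sample(lam=0.04)
    guided = sample(alpha=0.5)
    print("std  base / ood   :", base.std(), ood.std())
    print("mean base / guided:", base.mean(), guided.mean())
```

In this toy setting, a nonzero `lam` weakens the pull toward the training distribution, so samples spread beyond it, while a nonzero `alpha` shifts samples toward the region the toy property favors. In the real method both the score and the property gradient would come from trained networks over molecular graphs.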

Related benchmarks

Task                                      Dataset         Metric                                 Result    Rank
Molecular Generation                      5ht1b           Docking Score (Top-Hit 5%, kcal/mol)   -11.145   29
Molecular Generation                      parp1           Top-Hit 5% Docking Score (kcal/mol)    -10.865   29
Molecular Generation                      jak2            Top-Hit 5% Docking Score (kcal/mol)    -10.147   29
Molecular Generation                      fa7             Top-Hit 5% Docking Score (kcal/mol)    -8.16     29
Molecular Generation                      braf            Top-Hit 5% Docking Score (kcal/mol)    -11.063   28
Conditional 2D Molecular Graph Generation Synth. & BBBP   Diversity                              92.73     14
Conditional 2D Molecular Graph Generation Synth. & HIV    Diversity                              92.8      14
Conditional 2D Molecular Graph Generation Synth. & BACE   Diversity                              89.02     14
3D Molecule Generation                    QM9             P (0 rings)                            80.7      13
Molecular Generation                      parp1           Novel Hit Ratio                        701.7     12
Showing 10 of 40 rows
