GenMol: A Drug Discovery Generalist with Discrete Diffusion

About

Drug discovery is a complex process that involves multiple stages and tasks. However, existing molecular generative models can only tackle some of these tasks. We present Generalist Molecular generative model (GenMol), a versatile framework that uses only a single discrete diffusion model to handle diverse drug discovery scenarios. GenMol generates Sequential Attachment-based Fragment Embedding (SAFE) sequences through non-autoregressive bidirectional parallel decoding, thereby allowing the utilization of a molecular context that does not rely on the specific token ordering while having better sampling efficiency. GenMol uses fragments as basic building blocks for molecules and introduces fragment remasking, a strategy that optimizes molecules by regenerating masked fragments, enabling effective exploration of chemical space. We further propose molecular context guidance (MCG), a guidance method tailored for masked discrete diffusion of GenMol. GenMol significantly outperforms the previous GPT-based model in de novo generation and fragment-constrained generation, and achieves state-of-the-art performance in goal-directed hit generation and lead optimization. These results demonstrate that GenMol can tackle a wide range of drug discovery tasks, providing a unified and versatile approach for molecular design. Our code is available at https://github.com/NVIDIA-Digital-Bio/genmol.
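The fragment remasking idea described above can be illustrated with a toy sketch. This is not the GenMol implementation: it assumes only that a sequence is made of fragments separated by `.`, and it stands in a caller-supplied `propose` function for the diffusion model and a caller-supplied `score` function for the optimization objective. The names `MASK`, `remask_fragment`, `fill_mask`, and `optimize` are hypothetical, and a single placeholder token is used for the masked fragment for simplicity.

```python
import random

MASK = "[MASK]"  # hypothetical placeholder; the real model's mask token may differ


def remask_fragment(sequence, rng):
    """Replace one randomly chosen dot-separated fragment with the mask token."""
    fragments = sequence.split(".")
    idx = rng.randrange(len(fragments))
    fragments[idx] = MASK
    return ".".join(fragments), idx


def fill_mask(masked_sequence, new_fragment):
    """Substitute a proposed fragment for the (single) mask token."""
    return masked_sequence.replace(MASK, new_fragment, 1)


def optimize(sequence, propose, score, steps, rng):
    """Greedy loop: remask a fragment, regenerate it, keep the result if it scores higher.

    `propose(masked_sequence, rng)` stands in for the generative model and
    `score(sequence)` for the property being optimized; both are assumptions.
    """
    best, best_score = sequence, score(sequence)
    for _ in range(steps):
        masked, _ = remask_fragment(best, rng)
        candidate = fill_mask(masked, propose(masked, rng))
        candidate_score = score(candidate)
        if candidate_score > best_score:
            best, best_score = candidate, candidate_score
    return best, best_score


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy strings, not real molecules: fragments are just labels here.
    toy_pool = ["AAA", "BB", "CCCC", "D"]
    best, best_score = optimize(
        "AAA.BB.D",
        propose=lambda masked, r: r.choice(toy_pool),
        score=len,  # toy objective: longer sequence is "better"
        steps=50,
        rng=rng,
    )
    print(best, best_score)
```

In this sketch the loop only ever accepts improvements, which is one simple acceptance rule among several possible ones; the source does not specify how GenMol selects or accepts regenerated fragments.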

Seul Lee, Karsten Kreis, Srimukh Prasad Veccham, Meng Liu, Danny Reidenbach, Yuxing Peng, Saee Paliwal, Weili Nie, Arash Vahdat • 2025

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Unconditional molecular generation | MOSES | Validity | 99.7 | 39 |
| Molecular Generation | parp1 | Top-Hit 5% Docking Score (kcal/mol) | -11.773 | 29 |
| Molecular Generation | fa7 | Top-Hit 5% Docking Score (kcal/mol) | -8.967 | 29 |
| Molecular Generation | 5ht1b | Docking Score (Top-Hit 5%, kcal/mol) | -11.914 | 29 |
| Molecular Generation | jak2 | Top-Hit 5% Docking Score (kcal/mol) | -10.417 | 29 |
| Molecular Generation | braf | Top-Hit 5% Docking Score (kcal/mol) | -11.394 | 28 |
| De novo small molecule generation | SAFE (test) | Validity | 96.7 | 22 |
| Goal-directed molecular optimization | PMO | Amlodipine MPO | 0.81 | 20 |
| De Novo Molecular Generation | ZINC Curated 22 (test) | Validity (%) | 0.999 | 17 |
| Molecular Generation | fa7 | #Circles | 2.3 | 12 |

Showing 10 of 21 rows
