
GenMol: A Drug Discovery Generalist with Discrete Diffusion

About

Drug discovery is a complex process that involves multiple stages and tasks. However, existing molecular generative models can only tackle some of these tasks. We present Generalist Molecular generative model (GenMol), a versatile framework that uses only a single discrete diffusion model to handle diverse drug discovery scenarios. GenMol generates Sequential Attachment-based Fragment Embedding (SAFE) sequences through non-autoregressive bidirectional parallel decoding, thereby allowing the utilization of a molecular context that does not rely on the specific token ordering while having better sampling efficiency. GenMol uses fragments as basic building blocks for molecules and introduces fragment remasking, a strategy that optimizes molecules by regenerating masked fragments, enabling effective exploration of chemical space. We further propose molecular context guidance (MCG), a guidance method tailored for masked discrete diffusion of GenMol. GenMol significantly outperforms the previous GPT-based model in de novo generation and fragment-constrained generation, and achieves state-of-the-art performance in goal-directed hit generation and lead optimization. These results demonstrate that GenMol can tackle a wide range of drug discovery tasks, providing a unified and versatile approach for molecular design. Our code is available at https://github.com/NVIDIA-Digital-Bio/genmol.

Seul Lee, Karsten Kreis, Srimukh Prasad Veccham, Meng Liu, Danny Reidenbach, Yuxing Peng, Saee Paliwal, Weili Nie, Arash Vahdat • 2025
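
The fragment remasking idea mentioned in the abstract (optimize a molecule by masking a fragment and regenerating it) can be illustrated with a toy loop. This is a minimal sketch under stated assumptions, not the GenMol implementation: `fill_mask` stands in for the discrete diffusion model, `score` stands in for a property oracle, the greedy accept rule is a simplification chosen for this example, and the fragment strings are placeholders rather than valid SAFE sequences with attachment points. The only property of SAFE relied on here is that fragments are separated by dots.

```python
import random
from typing import Callable, List

MASK = "[MASK]"  # hypothetical mask token, for illustration only


def split_fragments(safe: str) -> List[str]:
    """Split a dot-separated fragment sequence into its fragments."""
    return safe.split(".")


def mask_fragment(fragments: List[str], index: int) -> List[str]:
    """Return a copy of the fragment list with one fragment replaced by MASK."""
    return [MASK if i == index else frag for i, frag in enumerate(fragments)]


def fragment_remasking_step(
    safe: str,
    fill_mask: Callable[[List[str], random.Random], List[str]],
    score: Callable[[str], float],
    rng: random.Random,
) -> str:
    """Mask one random fragment, regenerate it, keep the result if it scores higher."""
    fragments = split_fragments(safe)
    index = rng.randrange(len(fragments))
    candidate = ".".join(fill_mask(mask_fragment(fragments, index), rng))
    # Greedy acceptance is an assumption of this sketch, not a claim about GenMol.
    return candidate if score(candidate) > score(safe) else safe


def optimize(safe, fill_mask, score, steps: int, seed: int = 0) -> str:
    """Apply the remasking step repeatedly with a seeded random generator."""
    rng = random.Random(seed)
    for _ in range(steps):
        safe = fragment_remasking_step(safe, fill_mask, score, rng)
    return safe


def toy_fill_mask(masked: List[str], rng: random.Random) -> List[str]:
    """Stand-in for the generative model: fill each MASK from a tiny fixed vocabulary."""
    vocab = ["CCO", "CN", "c1ccccc1", "C(=O)O"]
    return [rng.choice(vocab) if frag == MASK else frag for frag in masked]


def toy_score(safe: str) -> float:
    """Stand-in for a property oracle: count oxygen characters."""
    return float(safe.count("O"))


if __name__ == "__main__":
    print(optimize("CN.CN.CN", toy_fill_mask, toy_score, steps=50))
```

Because a candidate is only accepted when it scores strictly higher, the score never decreases across steps and the number of fragments stays fixed. A real setup would replace `toy_fill_mask` with sampling from a trained model and `toy_score` with the objective of interest.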

Related benchmarks

Task | Dataset | Metric | Result | Rank
Molecular Generation | parp1 | Top-Hit 5% Docking Score (kcal/mol) | -11.773 | 27
Molecular Generation | fa7 | Top-Hit 5% Docking Score (kcal/mol) | -8.967 | 27
Molecular Generation | 5ht1b | Docking Score (Top-Hit 5%, kcal/mol) | -11.914 | 27
Molecular Generation | jak2 | Top-Hit 5% Docking Score (kcal/mol) | -10.417 | 27
Molecular Generation | braf | Top-Hit 5% Docking Score (kcal/mol) | -11.394 | 26
De novo small molecule generation | SAFE (test) | Validity | 96.7 | 22
Unconditional molecular generation | MOSES | Validity | 99.7 | 20
De Novo Molecular Generation | ZINC Curated 22 (test) | Validity (%) | 0.999 | 17
Goal-directed molecular optimization | PMO | Albuterol Similarity | 0.937 | 16
Molecular Generation | fa7 | #Circles | 2.3 | 12
Showing 10 of 20 rows
