GeoDiff: a Geometric Diffusion Model for Molecular Conformation Generation

About

Predicting molecular conformations from molecular graphs is a fundamental problem in cheminformatics and drug discovery. Recently, significant progress has been achieved with machine learning approaches, especially with deep generative models. Inspired by the diffusion process in classical non-equilibrium thermodynamics, where heated particles diffuse from their original states to a noise distribution, in this paper we propose a novel generative model named GeoDiff for molecular conformation prediction. GeoDiff treats each atom as a particle and learns to directly reverse the diffusion process (i.e., transforming from a noise distribution to stable conformations) as a Markov chain. Modeling such a generation process is, however, very challenging, as the likelihood of conformations should be roto-translationally invariant. We theoretically show that Markov chains evolving with equivariant Markov kernels can induce an invariant distribution by design, and further propose building blocks for the Markov kernels to preserve the desirable equivariance property. The whole framework can be efficiently trained in an end-to-end fashion by optimizing a weighted variational lower bound to the (conditional) likelihood. Experiments on multiple benchmarks show that GeoDiff is superior or comparable to existing state-of-the-art approaches, especially on large molecules.
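
The link between equivariant Markov kernels and an invariant likelihood can be checked numerically on a toy example. The sketch below is not GeoDiff's network: `equivariant_mean`, its distance-based weights, the Gaussian form of the kernel, and the values of `step` and `sigma` are all assumptions made for illustration. It only shows that when the kernel mean is built from relative position vectors scaled by distance-dependent weights, the transition log-density does not change if both conformations are rotated and translated together.

```python
import numpy as np

def center(coords):
    # Remove the center of mass so that translations have no effect.
    return coords - coords.mean(axis=0, keepdims=True)

def equivariant_mean(coords, step=0.1):
    # Hypothetical kernel mean: each atom moves along relative vectors,
    # scaled by weights that depend only on interatomic distances.
    x = center(coords)
    diff = x[:, None, :] - x[None, :, :]      # (n, n, 3) relative vectors
    dist = np.linalg.norm(diff, axis=-1)      # (n, n) rotation-invariant distances
    weight = np.exp(-dist)                    # invariant scalar weights
    return x + step * (weight[:, :, None] * diff).sum(axis=1)

def log_kernel(prev_coords, curr_coords, sigma=0.5):
    # Unnormalized Gaussian log-density of prev_coords given curr_coords.
    resid = center(prev_coords) - equivariant_mean(curr_coords)
    return -0.5 * np.sum(resid ** 2) / sigma ** 2

def random_rotation(rng):
    # Orthogonal matrix from a QR decomposition, forced to determinant +1.
    q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
    if np.linalg.det(q) < 0:
        q[:, 0] = -q[:, 0]
    return q

rng = np.random.default_rng(0)
curr = rng.normal(size=(5, 3))   # toy 5-atom conformation at step t
prev = rng.normal(size=(5, 3))   # toy 5-atom conformation at step t-1
rot = random_rotation(rng)
shift = rng.normal(size=(1, 3))

before = log_kernel(prev, curr)
after = log_kernel(prev @ rot.T + shift, curr @ rot.T + shift)
print(np.isclose(before, after))
```

Because the mean rotates with its input and the Gaussian depends only on the norm of the residual, the two log-densities agree up to floating-point error.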

Minkai Xu, Lantao Yu, Yang Song, Chence Shi, Stefano Ermon, Jian Tang • 2022

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Molecule Conformer Generation | GEOM-Drugs δ = 0.75Å (test) | COV-R (mean) | 42.1 | 30 |
| Conformer Generation | GEOM-QM9 δ = 0.5Å (test) | Recall COV Mean | 76.5 | 30 |
| Conformer Generation | Ring puckering (unrelaxed) | Precision AMR (Å) | 28 | 12 |
| Molecular Conformation Generation | Molecular Conformation (test) | Precision AMR (Å) | 0.28 | 12 |
| Conformation Generation | GEOM-QM9 | Mean COV-R | 80.36 | 8 |
| Conformation Generation | GEOM-QM9 Domain Generalization | Coverage Recall Mean | 74.94 | 7 |
| Unconditional Generation | GEOM-DRUG (test) | AS | 77.7 | 6 |
| Ring puckering | Ring Puckering Relaxed with MMFF | Precision AMR (Å) | 0.17 | 6 |
| Ring Puckering Estimation | Ring puckering (unrelaxed) | Precision AMR (Å) | 0.24 | 6 |
| Molecular Conformation Generation | GEOM Drugs | COV-R Mean | 7.99 | 4 |
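
For readers unfamiliar with the coverage (COV) and average minimum RMSD (AMR) metrics listed here, the following is a minimal sketch of how they are commonly computed from a matrix of pairwise RMSD values between reference and generated conformers. The function names, the toy matrix, and the use of `<=` at the threshold δ are assumptions for illustration; individual benchmarks may differ in alignment procedure and threshold convention. The functions return coverage as a fraction, whereas the COV results listed here appear to be percentages.

```python
import numpy as np

def coverage_recall(rmsd, delta):
    # Fraction of reference conformers (rows) matched by at least one
    # generated conformer (columns) within the RMSD threshold delta.
    return float(np.mean(rmsd.min(axis=1) <= delta))

def amr_recall(rmsd):
    # Mean over reference conformers of the RMSD to the closest generated one.
    return float(np.mean(rmsd.min(axis=1)))

def coverage_precision(rmsd, delta):
    # Same as recall, with the roles of reference and generated sets swapped.
    return coverage_recall(rmsd.T, delta)

def amr_precision(rmsd):
    return amr_recall(rmsd.T)

# Toy RMSD matrix in Å: 3 reference conformers (rows), 2 generated (columns).
rmsd = np.array([
    [0.3, 0.9],
    [0.8, 0.6],
    [1.2, 1.1],
])

cov_r = coverage_recall(rmsd, delta=0.75)
amr_r = amr_recall(rmsd)
cov_p = coverage_precision(rmsd, delta=0.75)
amr_p = amr_precision(rmsd)
print(cov_r, amr_r, cov_p, amr_p)
```

In this toy case two of the three reference conformers are covered at δ = 0.75 Å, while both generated conformers lie within the threshold of some reference, so precision coverage is higher than recall coverage.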