GeoMol: Torsional Geometric Generation of Molecular 3D Conformer Ensembles

About

Predicting a molecule's 3D conformer ensemble from its molecular graph plays a key role in cheminformatics and drug discovery. Existing generative models have several drawbacks, including failure to model important elements of molecular geometry (e.g. torsion angles), separate optimization stages prone to error accumulation, and the need for structure fine-tuning based on approximate classical force fields or computationally expensive methods such as metadynamics with approximate quantum mechanics calculations at each geometry. We propose GeoMol, an end-to-end, non-autoregressive, and SE(3)-invariant machine learning approach that generates distributions of low-energy molecular 3D conformers. Leveraging the power of message passing neural networks (MPNNs) to capture local and global graph information, we predict local atomic 3D structures and torsion angles, avoiding unnecessary over-parameterization of the geometric degrees of freedom (e.g. one angle per non-terminal bond). Such local predictions suffice both for computing the training loss and for the full deterministic conformer assembly at test time. We devise a non-adversarial, optimal-transport-based loss function to promote diverse conformer generation. GeoMol predominantly outperforms popular open-source, commercial, and state-of-the-art machine learning (ML) models, while achieving significant speed-ups. We expect such differentiable 3D structure generators to significantly impact molecular modeling and related applications.

Octavian-Eugen Ganea, Lagnajit Pattanaik, Connor W. Coley, Regina Barzilay, Klavs F. Jensen, William H. Green, Tommi S. Jaakkola • 2021
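
The abstract names torsion angles as one of the local quantities the model predicts. As a reminder of what that quantity is, the sketch below computes the signed torsion (dihedral) angle defined by four atom positions around a central bond. It is a generic geometry helper, not code from GeoMol; the function name `torsion_angle` and the sign convention are assumptions made for illustration.

```python
import math

def _sub(a, b):
    return [a[i] - b[i] for i in range(3)]

def _dot(a, b):
    return sum(a[i] * b[i] for i in range(3))

def _cross(a, b):
    return [
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    ]

def torsion_angle(p0, p1, p2, p3):
    """Signed dihedral angle (radians) for atoms p0-p1-p2-p3.

    p1-p2 is the central bond; the result lies in (-pi, pi].
    The sign convention is one common choice, assumed here.
    """
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = [c / norm for c in b1]  # unit vector along the central bond
    # Project the outer bonds onto the plane perpendicular to the central bond.
    s0 = _dot(b0, b1)
    s2 = _dot(b2, b1)
    v = [b0[i] - s0 * b1[i] for i in range(3)]
    w = [b2[i] - s2 * b1[i] for i in range(3)]
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.atan2(y, x)

# Hypothetical coordinates: a planar cis arrangement and a trans arrangement.
cis = torsion_angle([1, 0, 0], [0, 0, 0], [0, 0, 1], [1, 0, 1])
trans = torsion_angle([1, 0, 0], [0, 0, 0], [0, 0, 1], [-1, 0, 1])
```

With one such angle per rotatable bond plus the local structure around each atom, a conformer can in principle be assembled deterministically, which is the parameterization the abstract describes.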

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Conformer Generation | GEOM-QM9 δ = 0.5Å (test) | Recall COV Mean: 91.5 | 30 |
| Molecule Conformer Generation | GEOM-Drugs δ = 0.75Å (test) | COV-R (mean): 44.6 | 30 |
| Conformer ensemble generation | GEOM-DRUGS (test) | Coverage R Mean (%): 91.34 | 12 |
| Conformer ensemble generation | GEOM-QM9 (test) | COV-R Mean: 91.52 | 10 |
| Conformation Generation | GEOM-QM9 | Mean COV-R: 71.26 | 8 |
| Conformation Generation | GEOM-QM9 Domain Generalization | Coverage Recall Mean: 71.26 | 7 |
| Conformer ensemble generation | QM9 GEOM (test) | COV-R Mean: 0.9152 | 5 |
| Conformer Generation | GEOM-DRUGS v1 (test) | Recall Coverage Mean: 44.6 | 5 |
| Conformer Generation | GEOM-XL | AMR Recall Mean: 2.47 | 3 |
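
Most rows above report a recall-side coverage score (COV-R) at an RMSD threshold δ, and one reports average minimum RMSD (AMR) recall. The sketch below shows how these two per-molecule metrics are commonly computed from a precomputed RMSD matrix. The function names, the strict `<` comparison, and the toy numbers are assumptions for illustration; individual benchmarks may differ in alignment procedure, threshold, and number of generated conformers.

```python
def coverage_recall(rmsd, delta):
    """Fraction of reference conformers matched by some generated conformer.

    rmsd[i][j] is the RMSD (in Å) between reference conformer i and
    generated conformer j, assumed to be computed after alignment.
    A reference counts as covered if its closest generated conformer
    lies within delta (strict inequality assumed here).
    """
    if not rmsd:
        raise ValueError("need at least one reference conformer")
    covered = sum(1 for row in rmsd if min(row) < delta)
    return covered / len(rmsd)

def amr_recall(rmsd):
    """Average, over reference conformers, of the minimum RMSD to any generated one."""
    if not rmsd:
        raise ValueError("need at least one reference conformer")
    return sum(min(row) for row in rmsd) / len(rmsd)

# Toy matrix (made-up values): 3 reference conformers x 2 generated conformers.
toy_rmsd = [
    [0.2, 0.9],
    [0.8, 0.6],
    [1.0, 0.4],
]
cov = coverage_recall(toy_rmsd, delta=0.5)  # references 0 and 2 are covered
amr = amr_recall(toy_rmsd)                  # mean of 0.2, 0.6, 0.4
```

The reported "mean" is then an average of the per-molecule score over the test set. Note that the table lists coverage both as a percentage (e.g. 91.52) and as a fraction (0.9152), so the scale should be checked before comparing rows.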