3D Equivariant Diffusion for Target-Aware Molecule Generation and Affinity Prediction
About
Rich data and powerful machine learning models allow us to design drugs for a specific protein target in silico. Recently, including 3D structures in targeted drug design has shown superior performance to target-free models, because atomic interactions in 3D space are modeled explicitly. However, current 3D target-aware models rely either on voxelized atom densities, which are not equivariant to rotation, or on autoregressive sampling, which easily violates geometric constraints and yields unrealistic structures. In this work, we develop a 3D equivariant diffusion model to address these challenges. To achieve target-aware molecule design, our method learns a joint generative process over continuous atom coordinates and categorical atom types with an SE(3)-equivariant network. Moreover, we show that, under proper parameterization, our model can serve as an unsupervised feature extractor for estimating binding affinity, which provides an effective route to drug screening. We also propose a comprehensive framework that evaluates the quality of sampled molecules along multiple dimensions. Empirical studies show that our model generates molecules with more realistic 3D structures and better affinities towards the protein targets, and improves binding affinity ranking and prediction without retraining.
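To make the "joint generative process" over continuous coordinates and categorical atom types more concrete, here is a minimal sketch of one forward (noising) step on a toy ligand. The abstract does not specify the noise schedule or transition kernels, so the Gaussian coordinate kernel, the uniform-resampling kernel for atom types, the `alpha_bar` value, and all function names are assumptions chosen for illustration. The protein pocket conditioning and the SE(3)-equivariant denoising network are omitted entirely.

```python
import math
import random


def center(coords):
    """Subtract the centroid so the coordinates have zero center of mass."""
    n = len(coords)
    c = [sum(p[k] for p in coords) / n for k in range(3)]
    return [[p[k] - c[k] for k in range(3)] for p in coords]


def noise_coords(coords, alpha_bar, rng):
    """Assumed Gaussian kernel: x_t = sqrt(a) * x_0 + sqrt(1 - a) * eps."""
    signal = math.sqrt(alpha_bar)
    sigma = math.sqrt(1.0 - alpha_bar)
    return [[signal * x + sigma * rng.gauss(0.0, 1.0) for x in p] for p in coords]


def noise_types(types, num_types, alpha_bar, rng):
    """Assumed categorical kernel: keep each type with probability a,
    otherwise resample it uniformly from the num_types classes."""
    return [t if rng.random() < alpha_bar else rng.randrange(num_types)
            for t in types]


rng = random.Random(0)
# Toy three-atom ligand; coordinates and type indices are made up.
ligand_xyz = center([[0.0, 0.0, 0.0], [1.5, 0.0, 0.0], [1.5, 1.2, 0.0]])
ligand_types = [0, 0, 2]  # hypothetical indices into a 4-symbol atom vocabulary

noisy_xyz = noise_coords(ligand_xyz, alpha_bar=0.9, rng=rng)
noisy_types = noise_types(ligand_types, num_types=4, alpha_bar=0.9, rng=rng)
```

Centering the coordinates first is a common way to remove the translational degree of freedom before adding isotropic noise. A generative model would be trained to reverse both kernels jointly, and that reverse process is where an equivariant network would enter.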
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Molecule Generation | Pocket dataset | Time (s) | 3.43e+3 | 12 |
| structure-based drug design | CrossDocked 2020 (test) | Top-1 Docking Score (Avg) | -9.38 | 11 |
| structure-based drug design | MolGenBench In(RM.): Proteins in CrossDock, remove SMILES in CrossDock (train) | Hit Recovery | 56 | 10 |
| Structure-conditioned molecular generation | CrossDocked 2020 (test) | Vina Dock Score | -7.8 | 10 |
| structure-based drug design | MolGenBench Proteins in CrossDock | Pass Rate | 5.99 | 10 |
| structure-based drug design | MolGenBench Not: Proteins not in CrossDock | Pass Rate | 6.64 | 10 |
| Dual-target drug design | 12,917 pairs of targets dual-target setting (test) | P-1 Vina Dock (Avg) | -8.62 | 7 |
| Dual-conditioned molecular generation | 3o96_A pocket | Vina Dock | -10.8 | 7 |
| Dual-target ligand generation | DDF GSK3β & JNK3 (test) | Average Vina Docking Score (P1) | -8.95 | 6 |
| De novo structure-based drug design | DDF (test) | QED | 0.54 | 6 |