EvoEGF-Mol: Evolving Exponential Geodesic Flow for Structure-based Drug Design
About
Structure-Based Drug Design (SBDD) aims to discover bioactive ligands. Conventional approaches construct probability paths separately in Euclidean space for continuous atomic coordinates and in probabilistic space for discrete chemical categories, leading to a mismatch with the underlying statistical manifolds. We address this issue from an information-geometric perspective by modeling molecules as composite exponential-family distributions and defining generative flows along exponential geodesics under the Fisher-Rao metric. To avoid the instantaneous trajectory collapse induced by geodesics that directly target Dirac distributions, we propose Evolving Exponential Geodesic Flow for SBDD (EvoEGF-Mol), which replaces static Dirac targets with dynamically concentrating distributions and ensures stable training via a progressive-parameter-refinement architecture. Our model approaches a reference-level PoseBusters passing rate (93.4%) on CrossDock, indicating high geometric precision and interaction fidelity, and it outperforms baselines on real-world MolGenBench tasks by recovering bioactive scaffolds and generating candidates that meet established MedChem filters.
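The notion of an exponential geodesic, and the collapse toward a Dirac target mentioned above, can be illustrated with a minimal sketch. It relies only on the textbook fact that an exponential geodesic interpolates the natural parameters of an exponential family linearly; it is not the EvoEGF-Mol implementation, and the function names `categorical_e_geodesic` and `gaussian_e_geodesic` are hypothetical. A categorical distribution stands in for a discrete chemical category and a 1-D Gaussian for a single atomic coordinate.

```python
import math

def categorical_e_geodesic(p0, p1, t):
    """Point at time t on the exponential geodesic between two categorical
    distributions with strictly positive entries: log-probabilities are
    interpolated linearly, then renormalized."""
    logits = [(1.0 - t) * math.log(a) + t * math.log(b) for a, b in zip(p0, p1)]
    m = max(logits)  # subtract the max for numerical stability
    weights = [math.exp(x - m) for x in logits]
    z = sum(weights)
    return [w / z for w in weights]

def gaussian_e_geodesic(mu0, var0, mu1, var1, t):
    """Point at time t on the exponential geodesic between two 1-D Gaussians.
    Linear interpolation of the natural parameters (mu/var, -1/(2*var)) is
    equivalent to interpolating the precision and the precision-weighted mean."""
    lam0, lam1 = 1.0 / var0, 1.0 / var1
    lam_t = (1.0 - t) * lam0 + t * lam1
    eta_t = (1.0 - t) * lam0 * mu0 + t * lam1 * mu1
    return eta_t / lam_t, 1.0 / lam_t  # (mean, variance)

if __name__ == "__main__":
    # Discrete part: path from a uniform prior toward a peaked distribution.
    print(categorical_e_geodesic([0.5, 0.5], [0.9, 0.1], 0.5))

    # Continuous part: a nearly Dirac target (tiny variance) makes the
    # variance shrink almost immediately, even at t = 0.01.
    print(gaussian_e_geodesic(0.0, 1.0, 2.0, 1e-12, 0.01))

    # A moderately concentrated target (assumed variance 0.25) keeps the
    # early part of the path spread out.
    print(gaussian_e_geodesic(0.0, 1.0, 2.0, 0.25, 0.01))
```

In the Gaussian case the precision along the path is `(1 - t) / var0 + t / var1`, so as `var1` tends to zero the precision diverges for every `t > 0`. This is one way to read the "instantaneous trajectory collapse" that motivates replacing a static Dirac target with a distribution that concentrates gradually.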
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| structure-based drug design | MolGenBench Proteins in CrossDock | Pass Rate 37.52 | 10 |
| structure-based drug design | MolGenBench In(RM.): Proteins in CrossDock, remove SMILES in CrossDock (train) | Hit Recovery500 | 10 |
| structure-based drug design | MolGenBench Not: Proteins not in CrossDock | Pass Rate 33.75 | 10 |
| structure-based drug design | CrossDock 2020 (test) | PB Valid 93.4 | 6 |