EvoEGF-Mol: Evolving Exponential Geodesic Flow for Structure-based Drug Design
About
Structure-Based Drug Design (SBDD) aims to discover bioactive ligands. Conventional approaches construct probability paths separately in Euclidean and probabilistic spaces for continuous atomic coordinates and discrete chemical categories, leading to a mismatch with the underlying statistical manifolds. We address this issue by representing molecules using composite exponential-family distributions, where coordinates and categories are represented within a unified natural parameter space to evolve synchronously along exponential geodesics under the Fisher-Rao metric. To avoid the instantaneous trajectory collapse induced by geodesics directly targeting Dirac distributions, we propose Evolving Exponential Geodesic Flow for SBDD (EvoEGF-Mol), which replaces static Dirac targets with dynamically concentrating distributions and is trained with a progressive-parameter-refinement architecture. Our model approaches a reference-level PoseBusters passing rate (93.4%) on CrossDock, demonstrating remarkable geometric precision and interaction fidelity, while achieving superior performance over baseline methods on real-world MolGenBench tasks for bioactive scaffold recovery. Code is available at https://github.com/BLEACH366/EvoEGF-Mol.
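The collapse mentioned in the abstract can be illustrated with a small sketch. It is not taken from the EvoEGF-Mol repository. It assumes the textbook definition of an exponential geodesic as a straight line in natural-parameter space, applied to a 1-D Gaussian (standing in for one atomic coordinate) and a categorical distribution (standing in for an atom type). All function names are hypothetical.

```python
import math

# Illustrative sketch only: function names are hypothetical and this is
# not the authors' implementation.

def gaussian_to_natural(mu, var):
    """Natural parameters of a 1-D Gaussian: (mu / var, -1 / (2 * var))."""
    return mu / var, -0.5 / var

def natural_to_gaussian(eta1, eta2):
    """Invert the map above, returning (mean, variance)."""
    var = -0.5 / eta2
    return eta1 * var, var

def e_geodesic_gaussian(src, tgt, t):
    """Point at time t on the straight line between src and tgt in
    natural-parameter space; src and tgt are (mean, variance) pairs."""
    s1, s2 = gaussian_to_natural(*src)
    g1, g2 = gaussian_to_natural(*tgt)
    return natural_to_gaussian((1 - t) * s1 + t * g1, (1 - t) * s2 + t * g2)

def e_geodesic_categorical(src_probs, tgt_probs, t):
    """Interpolate log-probabilities linearly, then renormalize.
    Assumes every probability is strictly positive."""
    logits = [(1 - t) * math.log(p) + t * math.log(q)
              for p, q in zip(src_probs, tgt_probs)]
    peak = max(logits)  # subtract the max for numerical stability
    weights = [math.exp(x - peak) for x in logits]
    total = sum(weights)
    return [w / total for w in weights]

prior = (0.0, 1.0)          # standard normal prior over a coordinate
near_dirac = (5.0, 1e-6)    # almost a point mass at x = 5
soft_target = (5.0, 0.5)    # same mean, but a broad target

mean_sharp, _ = e_geodesic_gaussian(prior, near_dirac, 0.01)
mean_soft, _ = e_geodesic_gaussian(prior, soft_target, 0.01)
print(f"t=0.01, near-Dirac target: mean = {mean_sharp:.4f}")
print(f"t=0.01, broad target:      mean = {mean_soft:.4f}")

mixed = e_geodesic_categorical([0.25, 0.25, 0.5], [0.98, 0.01, 0.01], 0.5)
print("t=0.5 categorical:", [round(p, 3) for p in mixed])
```

With the near-Dirac target, the interpolated mean is already within 0.001 of the target at t = 0.01, because the target's very large precision dominates the natural-parameter average. With the broad target, the mean has moved only to about 0.1 at the same time. This is one way to read the motivation for replacing static Dirac targets with distributions that concentrate gradually.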
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| structure-based drug design | MolGenBench Proteins in CrossDock | Pass Rate | 37.52 | 10 |
| structure-based drug design | MolGenBench In(RM.): Proteins in CrossDock, remove SMILES in CrossDock (train) | Hit Recovery | 500 | 10 |
| structure-based drug design | MolGenBench Not: Proteins not in CrossDock | Pass Rate | 33.75 | 10 |
| structure-based drug design | CrossDock 2020 (test) | PB Valid | 93.4 | 6 |