Population-based de novo molecule generation, using grammatical evolution
About
Automatic design with machine learning and molecular simulations has shown a remarkable ability to generate new and promising drug candidates. Current models, however, still have problems in simulation concurrency and molecular diversity. Most methods generate one molecule at a time and do not allow multiple simulators to run simultaneously. Additionally, better molecular diversity could boost the success rate in the subsequent drug discovery process. We propose a new population-based approach using grammatical evolution named ChemGE. In our method, a large population of molecules is updated concurrently and evaluated by multiple simulators in parallel. In docking experiments with thymidine kinase, ChemGE succeeded in generating hundreds of high-affinity molecules whose diversity is better than that of known binding molecules in DUD-E.
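The idea of population-based grammatical evolution with parallel evaluation can be illustrated with a minimal sketch. It is not the ChemGE implementation: the two-rule `GRAMMAR` is a toy stand-in for a SMILES grammar, `toy_score` is a hypothetical placeholder for a docking simulator, and the elitist (mu + lambda)-style selection, the function names, and all parameter values are assumptions made for illustration.

```python
import random
from concurrent.futures import ThreadPoolExecutor

# Toy context-free grammar (hypothetical; far smaller than a real SMILES grammar).
GRAMMAR = {
    "<smiles>": [["<atom>"], ["<atom>", "<smiles>"]],
    "<atom>": [["C"], ["N"], ["O"]],
}

def decode(genes, start="<smiles>", max_steps=50):
    """Map an integer chromosome to a string. Each gene selects a production
    for the leftmost non-terminal via gene % number_of_choices; genes are
    reused cyclically, and None is returned if the derivation does not finish."""
    symbols, out, used = [start], [], 0
    while symbols:
        sym = symbols.pop(0)
        if sym not in GRAMMAR:          # terminal symbol: emit it
            out.append(sym)
            continue
        if used >= max_steps:           # derivation too long: treat as invalid
            return None
        rules = GRAMMAR[sym]
        rule = rules[genes[used % len(genes)] % len(rules)]
        used += 1
        symbols = list(rule) + symbols  # expand the leftmost non-terminal
    return "".join(out)

def toy_score(smiles):
    """Placeholder objective (assumption): reward heteroatoms, penalize length.
    A real setup would call a docking simulator here."""
    return smiles.count("N") + smiles.count("O") - 0.1 * len(smiles)

def fitness(genes):
    smiles = decode(genes)
    return float("-inf") if smiles is None else toy_score(smiles)

def mutate(genes, rng):
    """Replace one randomly chosen gene with a fresh random value."""
    child = list(genes)
    child[rng.randrange(len(child))] = rng.randrange(256)
    return child

def evolve(mu=10, lam=20, n_genes=8, generations=5, seed=0, workers=4):
    """Keep mu parents, create lam mutated children per generation, score the
    children in parallel, and keep the best mu of parents plus children."""
    rng = random.Random(seed)
    population = [[rng.randrange(256) for _ in range(n_genes)] for _ in range(mu)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        scored = list(zip(pool.map(fitness, population), population))
        for _ in range(generations):
            children = [mutate(rng.choice(scored)[1], rng) for _ in range(lam)]
            child_scores = pool.map(fitness, children)  # evaluated concurrently
            merged = scored + list(zip(child_scores, children))
            scored = sorted(merged, key=lambda pair: pair[0], reverse=True)[:mu]
    return scored

if __name__ == "__main__":
    for score, genes in evolve()[:3]:
        print(round(score, 2), decode(genes))
```

Because each fitness call is independent, the whole batch of children can be handed to a worker pool at once. With a real simulator in place of `toy_score`, a process pool or a job scheduler would probably be a better fit than threads.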
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| de novo molecular design | GuacaMol goal-directed tasks | Osimertinib MPO Score: 0.886 | 23 |
| Molecule Optimization | GuacaMol (test) | Total Score Sum: 4.732 | 8 |
| Molecule Optimization | GuacaMol v1 | Med1 Score20.7 | 8 |