A Graph to Graphs Framework for Retrosynthesis Prediction
About
A fundamental problem in computational chemistry is to find a set of reactants that synthesize a target molecule, a task known as retrosynthesis prediction. Existing state-of-the-art methods match the target molecule against a large set of reaction templates, which is computationally expensive and suffers from limited coverage. In this paper, we propose G2Gs, a novel template-free approach that transforms a target molecular graph into a set of reactant molecular graphs. G2Gs first splits the target molecular graph into a set of synthons by identifying the reaction centers, and then translates the synthons into the final reactant graphs via a variational graph translation framework. Experimental results show that G2Gs significantly outperforms existing template-free approaches by up to 63% in top-1 accuracy and achieves performance close to that of state-of-the-art template-based approaches, while requiring no domain knowledge and scaling much better.
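The first stage described above, splitting the target graph into synthons at the reaction centers, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a molecule is given as a dictionary of atom indices and a list of bonds, and that the reaction-center bonds are already known, whereas G2Gs predicts them. The function name `split_into_synthons` and the ethyl acetate example are hypothetical, and the second stage (variational graph translation) is not shown.

```python
from collections import defaultdict


def split_into_synthons(atoms, bonds, reaction_centers):
    """Delete the reaction-center bonds and return each remaining
    connected component (a synthon) as a sorted list of atom indices."""
    broken = {frozenset(bond) for bond in reaction_centers}

    # Build an adjacency map that skips the broken bonds.
    adjacency = defaultdict(set)
    for a, b in bonds:
        if frozenset((a, b)) not in broken:
            adjacency[a].add(b)
            adjacency[b].add(a)

    # Depth-first search to collect connected components.
    seen, synthons = set(), []
    for start in atoms:
        if start in seen:
            continue
        seen.add(start)
        component, stack = [], [start]
        while stack:
            node = stack.pop()
            component.append(node)
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        synthons.append(sorted(component))
    return synthons


# Hypothetical example: heavy atoms of ethyl acetate, CH3-C(=O)-O-CH2-CH3.
atoms = {0: "C", 1: "C", 2: "O", 3: "O", 4: "C", 5: "C"}
bonds = [(0, 1), (1, 2), (1, 3), (3, 4), (4, 5)]

# Assume the ester C-O bond (1, 3) was identified as the reaction center.
synthons = split_into_synthons(atoms, bonds, [(1, 3)])
print(synthons)
```

Here the acyl fragment (atoms 0 to 2) and the alkoxy fragment (atoms 3 to 5) come out as separate synthons. In the approach the abstract describes, each synthon would then be translated into a complete reactant graph.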
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Retrosynthesis | USPTO-50k Reaction type unknown (test) | Top-1 Accuracy | 48.9 | 59 |
| Retrosynthesis | USPTO-50k Reaction type known (test) | Top-1 Accuracy | 61 | 50 |
| Retrosynthesis prediction | USPTO-50k (test) | Top-1 Accuracy | 61 | 39 |
| Retrosynthesis | USPTO-50K | Top-1 Accuracy | 66.8 | 33 |
| Retrosynthesis prediction | USPTO-50K | Top-1 Acc (Unknown) | 48.9 | 22 |
| Single-step retrosynthesis | USPTO-50k (test) | Top-1 Accuracy | 48.9 | 18 |
| Retrosynthesis | USPTO-50K unknown reaction types (test) | Top-1 Accuracy | 48.9 | 17 |
| Retrosynthesis (reaction class not given) | USPTO-50k (test) | Top-1 Acc | 48.9 | 14 |
| Center identification | USPTO-50K | Top-1 Accuracy | 90.2 | 8 |
| Retrosynthesis prediction | USPTO-50k (40/5/5) | Top-1 Accuracy | 0.489 | 8 |