Amortized Molecular Optimization via Group Relative Policy Optimization

About

In structurally constrained molecular optimization, state-of-the-art methods restart an expensive oracle-driven search from scratch for every new input structure, scaling poorly to settings with many starting structures or expensive oracles. While amortized approaches that learn a transferable policy could in principle remove this bottleneck, existing methods struggle to generalize to diverse structural constraints at inference time. We present AMORTIX, an amortized Graph Transformer model that natively supports such constraints, optimizing molecular structures in a single forward pass with zero inference-time oracle calls. A central challenge for amortized training in this domain is that optimization difficulty varies drastically across starting structures. We show that, under this heterogeneity, standard reinforcement learning methods fail to stabilize training, and address this by normalizing rewards within groups of completions sharing the same starting structure. We evaluate on structurally constrained single- and multi-target kinase inhibitor design, and on a few-shot prodrug case study. AMORTIX outperforms both amortized and instance-optimization baselines on goal-directed scaffold decoration and ranks first among amortized methods on the PMO benchmark; the prodrug case study further demonstrates transfer of a learned modification rule to unseen drug structures. Code is available at https://github.com/Hash-hh/AMORTIX/.
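The group-wise reward normalization described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the common GRPO form in which each completion's reward is centered and scaled by the mean and standard deviation of its own group. The function name `group_relative_advantages`, the `eps` parameter, and the toy rewards are hypothetical.

```python
import numpy as np


def group_relative_advantages(rewards, group_ids, eps=1e-8):
    """Normalize rewards within each group (hypothetical helper).

    A group holds all completions sampled from the same starting structure.
    Assumes the standard GRPO form: (r - group mean) / (group std + eps).
    """
    rewards = np.asarray(rewards, dtype=float)
    group_ids = np.asarray(group_ids)
    advantages = np.empty_like(rewards)
    for g in np.unique(group_ids):
        mask = group_ids == g
        r = rewards[mask]
        # Center and scale using statistics of this group only, so an easy
        # starting structure (high rewards) and a hard one (low rewards)
        # yield advantages on the same scale.
        advantages[mask] = (r - r.mean()) / (r.std() + eps)
    return advantages


# Toy data: group 0 is a "hard" starting structure, group 1 an "easy" one.
rewards = [0.10, 0.20, 0.30, 0.80, 0.85, 0.90]
group_ids = [0, 0, 0, 1, 1, 1]
adv = group_relative_advantages(rewards, group_ids)
```

In this toy example the best completion of the hard group receives the same advantage as the best completion of the easy group, even though their raw rewards differ by 0.6. A single batch-wide baseline would instead give every completion of the hard group a negative advantage.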

Muhammad bin Javaid, Hasham Hussain, Ashima Khanna, Berke Kisin, Jonathan Pirnay, Alexander Mitsos, Dominik G. Grimm, Martin Grohe • 2026

Related benchmarks

Task	Dataset	Result	Rank
Goal-directed molecular optimization	PMO	Amlodipine MPO: 0.666	24
Scaffold-constrained molecular optimization	Kinase Scaffold Decoration (test)	Objective Score: 0.618	6
