
Controllable Molecular Generative Foundation Models

About

Despite the success of foundation models in language and vision, molecular graph generation still lacks a unified framework for heterogeneous design tasks with reliable controllability. While reinforcement learning (RL) offers a natural post-training mechanism for task-specific optimization, applying it to graph generative models is hindered by the vast atom-wise action spaces and chemically invalid intermediate states. We propose Controllable Molecular Generative Foundation Models (CoMole), built with a unified motif-aware graph diffusion pipeline. By learning a motif-aware graph space, CoMole transfers pretrained structural priors into controllable generation, where RL optimizes conditional reverse policies over chemically meaningful decisions. We theoretically characterize the bottleneck of atom-level RL and justify motif-aware policy optimization. Across three heterogeneous benchmarks spanning materials and drug discovery, CoMole ranks first in controllability on all nine targets, reduces MAE by up to 48.2% relative to the strongest baselines, and maintains validity above 0.94 without rule-based correction or post-hoc filtering. We further show that CoMole transfers controllability to unseen properties by optimizing only task embeddings with the generator frozen, achieving performance competitive with strong task-specific baselines.
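
The last claim, adapting to an unseen property by optimizing only a task embedding while the generator stays frozen, can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the "generator" is a fixed linear map from an embedding to a scalar property, and plain gradient descent on a squared error stands in for the RL objective the abstract describes. The names `frozen_generator` and `tune_task_embedding` are hypothetical and are not taken from CoMole.

```python
import random


def frozen_generator(embedding, weights):
    """Toy stand-in for a frozen generator (NOT CoMole's model):
    maps a task embedding to a single scalar property value."""
    return sum(w * e for w, e in zip(weights, embedding))


def tune_task_embedding(weights, target, dim, steps=200, lr=0.05, seed=0):
    """Optimize only the task embedding; `weights` is read but never updated."""
    rng = random.Random(seed)
    embedding = [rng.uniform(-0.1, 0.1) for _ in range(dim)]
    for _ in range(steps):
        error = frozen_generator(embedding, weights) - target
        # Gradient of 0.5 * error**2 with respect to embedding[i] is error * weights[i].
        embedding = [e - lr * error * w for e, w in zip(embedding, weights)]
    return embedding


weights = [0.5, -1.0, 2.0]      # hypothetical frozen generator parameters
weights_before = list(weights)  # snapshot to confirm they are untouched
target = 3.0                    # hypothetical target value for the unseen property

embedding = tune_task_embedding(weights, target, dim=3)
final_error = abs(frozen_generator(embedding, weights) - target)
print(f"absolute error after tuning: {final_error:.2e}")
```

The point of the sketch is the separation of roles: the only trainable state is the embedding vector, so adapting to a new target never modifies the generator parameters.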

Yihan Zhu, Yuhan Liu, Weijiang Li, Tengfei Luo, Meng Jiang • 2026

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Controllable Molecular Generation | Molecular and Polymer properties 9 properties aggregation (test) | Average Rank 1 | 27 |
| Conditional molecular generation | 10K Polymers (test) | Validity 98.83 | 14 |
| Heterogeneous Conditional Molecular Generation | 10K Polymers | Validity 96.88 | 14 |
| Heterogeneous Conditional Molecular Generation | 10K Molecules Drug-related task set | Validity 96.68 | 14 |
| Molecule Generation | Polymer and Drug datasets (test) | Novelty 93.9 | 14 |
| Controllable Molecular Generation | DFT unseen targets: Ei, EPS (test) | Validity 91.82 | 5 |
