Synergistic Benefits of Joint Molecule Generation and Property Prediction

About

Modeling the joint distribution of data samples and their properties makes it possible to construct a single model for both data generation and property prediction, with synergistic benefits reaching beyond purely generative or predictive models. However, training joint models presents daunting architectural and optimization challenges. Here, we propose Hyformer, a transformer-based joint model that successfully blends the generative and predictive functionalities, using an alternating attention mechanism and a joint pre-training scheme. We show that Hyformer is simultaneously optimized for molecule generation and property prediction, while exhibiting synergistic benefits in conditional sampling, out-of-distribution property prediction, and representation learning. Finally, we demonstrate the benefits of joint learning in a drug design use case of discovering novel antimicrobial peptides.
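
As a reading aid, the joint-modeling idea in the abstract can be written out with the chain rule of probability. The notation here is an assumption for illustration and is not taken from the paper: x denotes a molecule, y its property, and θ the parameters shared by both functionalities. The paper's actual training objective and its alternating attention mechanism may differ from this generic decomposition.

```latex
% Illustrative notation only (not from the paper):
% x = molecule, y = property, \theta = shared model parameters.
\begin{align}
p_\theta(x, y) &= p_\theta(x)\, p_\theta(y \mid x), \\
-\log p_\theta(x, y) &=
  \underbrace{-\log p_\theta(x)}_{\text{generation}}
  + \underbrace{-\log p_\theta(y \mid x)}_{\text{prediction}}.
\end{align}
```

Under this reading, sampling from p_θ(x) corresponds to unconditional generation, evaluating p_θ(y | x) corresponds to property prediction, and conditional sampling targets p_θ(x | y), which by Bayes' rule is proportional to p_θ(x) p_θ(y | x).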

Adam Izdebski, Jan Olszewski, Pankhil Gawade, Krzysztof Koras, Serra Korkmaz, Valentin Rauscher, Jakub M. Tomczak, Ewa Szczurek • 2025

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Molecule Generation | MOSES (test) | Validity: 99.6 | 33 |
| Unconditional Molecule Generation | GuacaMol | Validity: 98.6 | 30 |
| Molecular property prediction | MoleculeNet Regression Subsets | ESOL Error: 0.774 | 14 |
| Molecular property prediction | MoleculeNet Classification Subsets | BBBP Accuracy: 75.9 | 14 |
| Molecular Representation Learning | MoleculeNet | ESOL: 1.256 | 14 |
| Hit Identification | DRD2-Hi Lo-Hi benchmark | AUPRC: 78.4 | 13 |
| Hit Identification | HIV-Hi Lo-Hi benchmark | AUPRC: 15.8 | 13 |
| Hit Identification | KDR-Hi Lo-Hi benchmark | AUPRC: 70.1 | 12 |
| Hit Identification | Sol-Hi Lo-Hi | AUPRC: 0.64 | 12 |
| Antimicrobial peptide design | AMP design dataset | Perplexity: 17.98 | 5 |