Bayesian Flow Is All You Need to Sample Out-of-Distribution Chemical Spaces

About

Generating novel molecules with better properties than those found in the training space, known as out-of-distribution generation, is important for de novo drug design. However, this is difficult for distribution-learning-based models such as diffusion models, because these methods are designed to fit the distribution of the training data as closely as possible. In this paper, we show that Bayesian flow networks, and the ChemBFN model in particular, are intrinsically capable of generating high-quality out-of-distribution samples that meet the requirements of several scenarios. A reinforcement learning strategy is added to ChemBFN, and a controllable, ordinary differential equation solver-like generating process is employed that accelerates the sampling process. Most importantly, we introduce a semi-autoregressive strategy during training and inference that enhances model performance and surpasses state-of-the-art models. A theoretical analysis of out-of-distribution generation in ChemBFN with the semi-autoregressive approach is also included.

Nianze Tao, Minori Abe • 2024

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Molecular Generation | 5ht1b | Docking Score (Top-Hit 5%, kcal/mol) | -12.609 | 27 |
| Molecular Generation | parp1 | Top-Hit 5% Docking Score (kcal/mol) | -12.455 | 27 |
| Molecular Generation | fa7 | Top-Hit 5% Docking Score (kcal/mol) | -9.527 | 27 |
| Molecular Generation | jak2 | Top-Hit 5% Docking Score (kcal/mol) | -11.69 | 27 |
| Molecular Generation | braf | Top-Hit 5% Docking Score (kcal/mol) | -12.061 | 26 |
| Molecular Generation | fa7 | Novel Hit Ratio | 585.3 | 10 |
| Molecular Generation | braf | Novel Hit Ratio | 534 | 10 |
| Molecular Generation | parp1 | Novel Hit Ratio | 559.3 | 10 |
| Molecular Generation | jak2 | Novel Hit Ratio | 526 | 10 |
| Molecular Generation | 5ht1b | Novel Hit Ratio | 4.587 | 10 |
Showing 10 of 13 rows
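
The page does not define how the "Top-Hit 5%" docking score is computed. As a rough illustration only, the sketch below assumes it is the mean docking score of the best 5% of generated molecules, with more negative scores being better. The function name `top_hit_mean`, the `fraction` parameter, and the example scores are all hypothetical and are not taken from the paper or the leaderboard.

```python
import math


def top_hit_mean(docking_scores, fraction=0.05):
    """Mean of the best `fraction` of docking scores.

    Assumption: lower (more negative) docking scores in kcal/mol are better,
    and the metric is a plain average over the top-ranked molecules.
    """
    if not docking_scores:
        raise ValueError("docking_scores must not be empty")
    # Keep at least one molecule, even for very small sets.
    k = max(1, math.ceil(len(docking_scores) * fraction))
    best = sorted(docking_scores)[:k]  # ascending order: most negative first
    return sum(best) / k


# 40 made-up scores; the best 5% is the two most negative values.
scores = [-12.0, -11.0, -10.0, -9.5] + [-8.0] * 36
print(top_hit_mean(scores))
```

Under this assumed definition, a single strongly binding outlier can shift the reported value noticeably when few molecules are generated, so the number of molecules scored matters when comparing rows.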
