
Conditioning by adaptive sampling for robust design

About

We present a new method for design problems wherein the goal is to maximize or specify the value of one or more properties of interest. For example, in protein design, one may wish to find the protein sequence that maximizes fluorescence. We assume access to one or more, potentially black-box, stochastic "oracle" predictive functions, each of which maps from the input design space (e.g., protein sequences) to a distribution over a property of interest (e.g., protein fluorescence). At first glance, this problem can be framed as one of optimizing the oracle(s) with respect to the input. However, many state-of-the-art predictive models, such as neural networks, are known to suffer from pathologies, especially for data far from the training distribution. Thus we need to modulate the optimization of the oracle inputs with prior knowledge about what makes "realistic" inputs (e.g., proteins that stably fold). Herein, we propose a new method to solve this problem, Conditioning by Adaptive Sampling, which yields state-of-the-art results on a protein fluorescence problem, as compared to other recently published approaches. Formally, our method achieves its success by using model-based adaptive sampling to estimate the conditional distribution of the input sequences given the desired properties.
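The idea of model-based adaptive sampling under a prior can be illustrated with a small toy sketch. It is not the authors' implementation. Everything in it is an assumption made for illustration: the factorized categorical search model, the Gaussian-noise oracle, the quantile schedule for the threshold `gamma`, and the names `adaptive_sampling`, `prob_exceeds`, and `oracle_mean`. Each round draws sequences from the search model, weights them by the prior-to-search density ratio times the oracle's probability of exceeding the threshold, and refits the search model by weighted maximum likelihood.

```python
from math import erf

import numpy as np


def sample(probs, n, rng):
    """Draw n sequences from a factorized categorical model of shape (L, A)."""
    length, alphabet = probs.shape
    cols = [rng.choice(alphabet, size=n, p=probs[j]) for j in range(length)]
    return np.stack(cols, axis=1)


def log_prob(probs, xs):
    """Log-probability of each sequence in xs (shape (n, L)) under the model."""
    length = probs.shape[0]
    return np.log(probs[np.arange(length), xs]).sum(axis=1)


def prob_exceeds(mean, std, gamma):
    """P(y >= gamma) for y ~ Normal(mean, std): the assumed stochastic oracle."""
    z = (gamma - mean) / (std * np.sqrt(2.0))
    return 0.5 * (1.0 - np.vectorize(erf)(z))


def adaptive_sampling(prior, oracle_mean, oracle_std=1.0, n_samples=500,
                      n_iters=20, quantile=0.8, seed=0):
    """Toy loop: move a search model toward p(x | y >= gamma) under the prior."""
    rng = np.random.default_rng(seed)
    length, alphabet = prior.shape
    search = prior.copy()
    gamma = -np.inf
    for _ in range(n_iters):
        xs = sample(search, n_samples, rng)
        means = oracle_mean(xs)
        # Raise the threshold monotonically (an assumed schedule).
        gamma = max(gamma, np.quantile(means, quantile))
        # Importance weight: prior/search density ratio times oracle tail mass.
        ratio = np.exp(log_prob(prior, xs) - log_prob(search, xs))
        w = ratio * prob_exceeds(means, oracle_std, gamma)
        if w.sum() <= 0:
            break
        w = w / w.sum()
        # Weighted maximum-likelihood refit of the per-position categoricals.
        counts = np.zeros((length, alphabet))
        for a in range(alphabet):
            counts[:, a] = (w[:, None] * (xs == a)).sum(axis=0)
        counts += 1e-3  # small pseudocount keeps every probability positive
        search = counts / counts.sum(axis=1, keepdims=True)
    return search


# Toy problem: uniform prior over length-6 sequences on a 4-letter alphabet;
# the hypothetical oracle's mean is the number of positions holding symbol 0.
prior = np.full((6, 4), 0.25)
search = adaptive_sampling(prior, lambda xs: (xs == 0).sum(axis=1).astype(float))
```

Because the weights include the prior density, the search model is pulled toward high-scoring sequences without being free to drift arbitrarily far from what the prior considers plausible, which is the role the abstract assigns to prior knowledge about realistic inputs.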

David H. Brookes, Hahnbeom Park, Jennifer Listgarten • 2019

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Offline Black-box Optimization | LLM-DM | Normalized Median Score | 86.4 | 25 |
| Offline Black-box Optimization | Ant | Normalized Median Score | 0.384 | 25 |
| Offline Black-box Optimization | D'Kitty | Normalized Median Score | 0.753 | 25 |
| Offline Black-box Optimization | TF8 | Normalized Median Score | 42.8 | 25 |
| Offline Black-box Optimization | TF10 | Normalized Median Score | 0.463 | 25 |
| Offline Black-box Optimization | SuperC | Normalized Median Score | 11.1 | 25 |
| Offline Black-box Optimization | Overall Task Suite (SuperC, Ant, D'Kitty, LLM-DM, TF8, TF10) | Mean Rank | 20.2 | 24 |
| Offline Black-box Optimization | Design-bench 100-th percentile | TFBIND8 Score | 92.7 | 20 |
| Discrete Optimization | TF Bind 10 | Median Normalized Score | 0.463 | 16 |
| Neural Architecture Search | NAS | Median Normalized Score | 0.292 | 16 |
Showing 10 of 25 rows
