ExPT: Synthetic Pretraining for Few-Shot Experimental Design
About
Experimental design is a fundamental problem in many science and engineering fields. In this problem, sample efficiency is crucial due to the time, money, and safety costs of real-world design evaluations. Existing approaches either rely on active data collection or access to large, labeled datasets of past experiments, making them impractical in many real-world scenarios. In this work, we address the more challenging yet realistic setting of few-shot experimental design, where only a few labeled data points of input designs and their corresponding values are available. We approach this problem as a conditional generation task, where a model conditions on a few labeled examples and the desired output to generate an optimal input design. To this end, we introduce Experiment Pretrained Transformers (ExPT), a foundation model for few-shot experimental design that employs a novel combination of synthetic pretraining with in-context learning. In ExPT, we only assume knowledge of a finite collection of unlabeled data points from the input domain and pretrain a transformer neural network to optimize diverse synthetic functions defined over this domain. Unsupervised pretraining allows ExPT to adapt to any design task at test time in an in-context fashion by conditioning on a few labeled data points from the target task and generating the candidate optima. We evaluate ExPT on few-shot experimental design in challenging domains and demonstrate its superior generality and performance compared to existing methods. The source code is available at https://github.com/tung-nd/ExPT.git.
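The sketch below illustrates the general idea of synthetic pretraining plus in-context conditioning described above; it is not the authors' implementation. Assumptions not taken from the paper: synthetic functions are random MLPs over the unlabeled input pool, the model is a plain `TransformerEncoder` named `InContextDesigner`, and the candidate design is predicted with a simple MSE loss rather than ExPT's generative objective.

```python
# Hedged sketch: pretrain a transformer on synthetic functions over an unlabeled
# input pool, then condition on a few labeled (x, y) pairs plus a desired target
# value to generate a candidate design. Names and sizes are hypothetical.
import torch
import torch.nn as nn

DIM_X, D_MODEL, CTX = 8, 64, 16  # hypothetical design dimension, model width, context size

class InContextDesigner(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed_pair = nn.Linear(DIM_X + 1, D_MODEL)   # embed labeled (x, y) context pairs
        self.embed_query = nn.Linear(1, D_MODEL)           # embed the desired target value y*
        layer = nn.TransformerEncoderLayer(D_MODEL, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(D_MODEL, DIM_X)              # decode a candidate design x*

    def forward(self, ctx_x, ctx_y, target_y):
        pairs = self.embed_pair(torch.cat([ctx_x, ctx_y], dim=-1))  # (B, CTX, D)
        query = self.embed_query(target_y).unsqueeze(1)             # (B, 1, D)
        h = self.encoder(torch.cat([pairs, query], dim=1))          # attend over context + query
        return self.head(h[:, -1])                                  # read the design off the query token

def sample_synthetic_function():
    # Assumption: a cheap stand-in for the paper's family of synthetic functions.
    f = nn.Sequential(nn.Linear(DIM_X, 32), nn.Tanh(), nn.Linear(32, 1))
    for p in f.parameters():
        p.requires_grad_(False)
    return f

def pretrain(model, x_pool, steps=1000, batch=32):
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    for _ in range(steps):
        f = sample_synthetic_function()
        idx = torch.randint(len(x_pool), (batch, CTX + 1))
        x = x_pool[idx]                       # draw inputs from the unlabeled pool only
        y = f(x)                              # label them with the synthetic function
        ctx_x, ctx_y = x[:, :CTX], y[:, :CTX]
        tgt_x, tgt_y = x[:, CTX], y[:, CTX]   # held-out pair: predict its x from its y
        loss = ((model(ctx_x, ctx_y, tgt_y) - tgt_x) ** 2).mean()
        opt.zero_grad(); loss.backward(); opt.step()

# Few-shot use at test time: condition on a handful of labeled pairs from the target
# task and ask for a design whose value exceeds anything observed so far.
model = InContextDesigner()
pretrain(model, torch.randn(4096, DIM_X), steps=10)
few_x, few_y = torch.randn(1, CTX, DIM_X), torch.randn(1, CTX, 1)
candidate = model(few_x, few_y, few_y.max(dim=1).values + 1.0)
```

In this sketch the pretraining objective (recover a held-out input from its value and the context) is only one plausible instantiation of the conditional-generation framing in the abstract; the design choice that carries over is that all supervision comes from synthetic functions, so the model never needs labeled data from the downstream task until inference.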
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Offline Model-Based Optimization | GFP | 90th Percentile Oracle Score | 3.74 | 17 |
| Offline Model-Based Optimization | TF Bind 8 | 90th Percentile Oracle Score | 48 | 17 |
| Offline Model-Based Optimization | ChEMBL | 90th Percentile Oracle Score | 0.62 | 17 |
| Offline Model-Based Optimization | D'Kitty | 90th Percentile Oracle Score | 0.61 | 17 |
| Offline Model-Based Optimization | UTR | 90th Percentile Oracle Score | 6.7 | 17 |
| Model-Based Optimization | Design-Bench 2022 (test) | TF-Bind-8 Score | 0.927 | 16 |
| Offline Model-Based Optimization | Branin | 90th Percentile Oracle Score | -23.1 | 16 |
| Offline Model-Based Optimization | LogP | 90th Percentile Oracle Score | -16.7 | 16 |
| Model-Based Optimization | Design-Bench | LogP | -15.9 | 16 |
| Offline Model-Based Optimization | Warfarin | 90th Percentile Oracle Score | -40 | 15 |