
Importance-aware Co-teaching for Offline Model-based Optimization

About

Offline model-based optimization aims to find a design that maximizes a property of interest using only an offline dataset, with applications in robot, protein, and molecule design, among others. A prevalent approach is gradient ascent, where a proxy model is trained on the offline dataset and then used to optimize the design. This method suffers from an out-of-distribution issue: the proxy is not accurate for unseen designs. To mitigate this issue, we explore using a pseudo-labeler to generate valuable data for fine-tuning the proxy. Specifically, we propose Importance-aware Co-Teaching for Offline Model-based Optimization (ICT). This method maintains three symmetric proxies, with their mean ensemble as the final proxy, and comprises two steps. The first step is pseudo-label-driven co-teaching. In this step, one proxy is iteratively selected as the pseudo-labeler for designs near the current optimization point, generating pseudo-labeled data. Subsequently, a co-teaching process identifies small-loss samples as valuable data and exchanges them between the other two proxies for fine-tuning, promoting knowledge transfer. This procedure is repeated three times, with a different proxy chosen as the pseudo-labeler each time, ultimately enhancing the ensemble performance. To further improve the accuracy of the pseudo-labels, we perform a second step of meta-learning-based sample reweighting, which assigns importance weights to samples in the pseudo-labeled dataset and updates them via meta-learning. ICT achieves state-of-the-art results across multiple Design-Bench tasks, attaining the best mean rank of 3.1 and median rank of 2 among 15 methods. Our source code can be found here.
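The small-loss exchange in the co-teaching step can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the squared-error loss, the `keep_ratio` parameter, and all function names are assumptions made for illustration, and the proxies are represented only by their predictions on the pseudo-labeled designs.

```python
import numpy as np


def small_loss_indices(preds, pseudo_labels, keep_ratio):
    """Return indices of the samples with the smallest squared error.

    Squared error and keep_ratio are illustrative assumptions.
    """
    losses = (np.asarray(preds) - np.asarray(pseudo_labels)) ** 2
    n_keep = max(1, int(round(keep_ratio * len(losses))))
    return np.argsort(losses, kind="stable")[:n_keep]


def co_teaching_exchange(preds_a, preds_b, pseudo_labels, keep_ratio=0.5):
    """Each proxy selects its small-loss samples, then the picks are swapped."""
    picked_by_a = small_loss_indices(preds_a, pseudo_labels, keep_ratio)
    picked_by_b = small_loss_indices(preds_b, pseudo_labels, keep_ratio)
    # Proxy A is fine-tuned on B's picks, and proxy B on A's picks.
    return {"train_a_on": picked_by_b, "train_b_on": picked_by_a}


def ensemble_mean(preds_list):
    """Mean ensemble of the proxies' predictions (the final proxy output)."""
    return np.mean(np.stack([np.asarray(p) for p in preds_list]), axis=0)


# Toy usage: labels produced by the third proxy acting as pseudo-labeler.
pseudo_labels = np.array([1.0, 2.0, 3.0, 4.0])
preds_a = np.array([1.0, 2.5, 3.1, 6.0])
preds_b = np.array([2.0, 2.0, 3.5, 4.1])
exchange = co_teaching_exchange(preds_a, preds_b, pseudo_labels)
```

In the method as described, this exchange would be repeated three times, with each proxy taking a turn as the pseudo-labeler while the other two trade samples.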

Ye Yuan, Can Chen, Zixuan Liu, Willie Neiswanger, Xue Liu • 2023

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Offline Black-box Optimization | TF10 | Normalized Median Score: 0.541 | 25 |
| Offline Black-box Optimization | SuperC | Normalized Median Score: 39.9 | 25 |
| Offline Black-box Optimization | Ant | Normalized Median Score: 0.592 | 25 |
| Offline Black-box Optimization | TF8 | Normalized Median Score: 55.1 | 25 |
| Offline Black-box Optimization | D'Kitty | Normalized Median Score: 0.874 | 25 |
| Offline Black-box Optimization | LLM-DM | Normalized Median Score: 83 | 25 |
| Offline Black-box Optimization | Overall Task Suite (SuperC, Ant, D'Kitty, LLM-DM, TF8, TF10) | Mean Rank: 10.5 | 24 |
| Offline Black-box Optimization | Design-bench 100-th percentile | TFBIND8 Score: 95.8 | 20 |