LICO: Large Language Models for In-Context Molecular Optimization

About

Optimizing black-box functions is a fundamental problem in science and engineering. To solve this problem, many approaches learn a surrogate function that estimates the underlying objective from limited historical evaluations. Large Language Models (LLMs), with their strong pattern-matching capabilities via pretraining on vast amounts of data, stand out as a potential candidate for surrogate modeling. However, directly prompting a pretrained language model to produce predictions is not feasible in many scientific domains due to the scarcity of domain-specific data in the pretraining corpora and the challenges of articulating complex problems in natural language. In this work, we introduce LICO, a general-purpose model that extends arbitrary base LLMs for black-box optimization, with a particular application to the molecular domain. To achieve this, we equip the language model with a separate embedding layer and prediction layer, and train the model to perform in-context predictions on a diverse set of functions defined over the domain. Once trained, LICO can generalize to unseen molecule properties simply via in-context prompting. LICO performs competitively on PMO, a challenging molecular optimization benchmark comprising 23 objective functions, and achieves state-of-the-art performance on its low-budget version PMO-1K.
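The surrogate-guided loop described in the abstract can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the 1-D `objective`, the function names, and especially `in_context_predict`, which is a simple kernel-weighted average standing in for a trained in-context model. It is not LICO's architecture and involves no language model; it only shows how predictions conditioned on a growing set of (input, value) pairs can steer which candidate is evaluated next.

```python
import math
import random


def objective(x):
    # Hypothetical black-box objective; the optimizer only observes its outputs.
    return -(x - 0.7) ** 2


def in_context_predict(context, x, bandwidth=0.1):
    # Stand-in surrogate (assumption): kernel-weighted average of the values
    # observed so far. A trained in-context model would replace this function.
    weights = [math.exp(-((x - xi) / bandwidth) ** 2) for xi, _ in context]
    total = sum(weights)
    if total == 0.0:
        # Fall back to the plain mean if every weight underflows to zero.
        return sum(y for _, y in context) / len(context)
    return sum(w * y for w, (_, y) in zip(weights, context)) / total


def optimize(budget=30, n_init=5, n_candidates=200, seed=0):
    rng = random.Random(seed)
    # Limited historical evaluations form the initial context.
    context = [(x, objective(x)) for x in (rng.random() for _ in range(n_init))]
    while len(context) < budget:
        # Propose random candidates and rank them with the surrogate only.
        candidates = [rng.random() for _ in range(n_candidates)]
        chosen = max(candidates, key=lambda c: in_context_predict(context, c))
        # Spend one real evaluation on the chosen candidate and grow the context.
        context.append((chosen, objective(chosen)))
    return context


history = optimize()
best_x, best_y = max(history, key=lambda pair: pair[1])
```

The greedy selection rule is the simplest possible choice; a practical optimizer would typically balance predicted value against uncertainty when choosing candidates.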

Tung Nguyen, Aditya Grover • 2024

Related benchmarks

Task	Dataset	Metric	Result	
Molecular Optimization	Practical Molecular Optimization (PMO)	Sum AUC top-10	11.71	37
Goal-directed molecular optimization	PMO	Amlodipine MPO	0.541	24
Bioactivity-guided Molecule Generation	PMO-1K GSK3β	Top-10 AUC	0.617	13
Bioactivity-guided Molecule Generation	PMO-1K JNK3	Top-10 AUC	0.336	13
Bioactivity-guided Molecule Generation	PMO-1K DRD2	Top-10 AUC	85.9	13
Bioactivity	PMO-1K	Bioactivity (GSK3β)	0.617	12
Multi property optimization	PMO-1K	Amlo. Score	54.1	12
Rediscovery	PMO-1K	Cele. Score	0.447	12
Molecular Optimization	PMO-1K	Aggregate Score (22 Tasks)	11.71	8
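For readers unfamiliar with the Top-10 AUC metric in the table, the following is a simplified sketch of how such a score can be computed for a single task: track the mean of the best k scores found so far after each oracle call, then average that curve over the evaluation budget. The function name `top_k_auc` and its signature are hypothetical, and the sketch assumes scores already scaled to [0, 1]; the benchmark's reference implementation may differ in details such as logging frequency and integration rule.

```python
def top_k_auc(scores, k=10, budget=None):
    """Average of the running top-k mean over the oracle-call budget.

    `scores` lists objective values in the order they were evaluated.
    This is an illustrative simplification, not the benchmark's own code.
    """
    if not scores:
        raise ValueError("scores must contain at least one evaluation")
    budget = budget or len(scores)
    top = []       # best k scores seen so far, in descending order
    running = []   # running top-k mean after each oracle call
    for s in scores[:budget]:
        top.append(s)
        top.sort(reverse=True)
        top = top[:k]
        running.append(sum(top) / len(top))
    # If the run stopped before exhausting the budget, hold the last value.
    running += [running[-1]] * (budget - len(running))
    return sum(running) / budget


# Hypothetical score trace for one task, in evaluation order.
example = top_k_auc([0.1, 0.3, 0.2, 0.6, 0.5], k=2)
```

Because the curve is averaged over the whole budget, a method that finds good molecules early scores higher than one that reaches the same final values late, which is why the metric is suited to low-budget settings.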

