Offline Model-Based Optimization via Normalized Maximum Likelihood Estimation
About
In this work we consider data-driven optimization problems where one must maximize a function given only queries at a fixed set of points. This problem setting emerges in many domains where function evaluation is a complex and expensive process, such as in the design of materials, vehicles, or neural network architectures. Because the available data typically only covers a small manifold of the possible space of inputs, a principal challenge is to be able to construct algorithms that can reason about uncertainty and out-of-distribution values, since a naive optimizer can easily exploit an estimated model to return adversarial inputs. We propose to tackle this problem by leveraging the normalized maximum-likelihood (NML) estimator, which provides a principled approach to handling uncertainty and out-of-distribution inputs. While in the standard formulation NML is intractable, we propose a tractable approximation that allows us to scale our method to high-capacity neural network models. We demonstrate that our method can effectively optimize high-dimensional design problems in a variety of disciplines such as chemistry, biology, and materials engineering.
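To make the role of NML concrete, here is a minimal sketch of the standard conditional NML computation on a toy problem. It is not the tractable approximation proposed in the paper. It assumes a tabular categorical model with discrete inputs and labels, and the names `fit_mle`, `likelihood`, and `conditional_nml` are hypothetical helpers introduced only for illustration. For each candidate label, the query point is added to the data with that label, the model is refit by maximum likelihood, and the resulting likelihoods are normalized across labels.

```python
from collections import Counter


def fit_mle(data):
    """MLE for a tabular model p(y | x): per-input label counts.

    Assumption: x and y are discrete and hashable (toy setting only).
    """
    counts = {}
    for x, y in data:
        counts.setdefault(x, Counter())[y] += 1
    return counts


def likelihood(counts, x, y):
    """Empirical frequency of label y among training points with input x."""
    label_counts = counts.get(x)
    if not label_counts:
        return 0.0
    return label_counts[y] / sum(label_counts.values())


def conditional_nml(data, x, num_labels):
    """Conditional NML distribution over labels 0..num_labels-1 at query x.

    For each candidate label y, refit on data + [(x, y)] and record the
    likelihood the refit model assigns to y; then normalize over labels.
    """
    scores = []
    for y in range(num_labels):
        model = fit_mle(list(data) + [(x, y)])
        scores.append(likelihood(model, x, y))
    total = sum(scores)
    return [s / total for s in scores]


if __name__ == "__main__":
    train = [("a", 0), ("a", 0), ("a", 1)]
    print(conditional_nml(train, "a", num_labels=2))  # seen input
    print(conditional_nml(train, "b", num_labels=2))  # unseen input
```

In this toy setting the seen input `"a"` yields a distribution that leans toward the label observed more often (about 0.6 versus 0.4), while the unseen input `"b"` yields a uniform distribution, because the model can fit any label equally well there. This is the conservative behavior on out-of-distribution inputs that the abstract refers to. Refitting once per candidate label is also what makes the exact computation expensive for large models.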
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Neural Architecture Search | NAS | Median Normalized Score | 0.568 | 16 |
| Offline Model-Based Optimization | Ant Morphology (test) | Median Normalized Score | 0.593 | 16 |
| Offline Model-Based Optimization | D'Kitty Morphology (test) | Median Normalized Score | 0.885 | 16 |
| Offline Model-Based Optimization | Hopper Controller (test) | Median Normalized Score | 0.361 | 16 |
| Discrete Optimization | TF Bind 8 | Median Normalized Score | 43.9 | 16 |
| Offline Model-Based Optimization | Superconductor (test) | Median Normalized Score | 0.322 | 16 |
| Discrete Optimization | TF Bind 10 | Median Normalized Score | 0.456 | 16 |
| Offline Model-Based Optimization | Design-bench 100th percentile v1 (test) | GFP Score | 3.359 | 7 |
| Offline Model-Based Optimization | Design-bench (test) | GFP Score | 3.219 | 6 |