InstructZero: Efficient Instruction Optimization for Black-Box Large Language Models
About
Large language models (LLMs) are instruction followers, but it can be challenging to find the best instruction for different situations, especially for black-box LLMs, for which backpropagation is forbidden. Instead of directly optimizing the discrete instruction, we optimize a low-dimensional soft prompt applied to an open-source LLM, which generates the instruction for the black-box LLM. On each iteration of the proposed method, called InstructZero, the soft prompt is converted into an instruction by the open-source LLM; the instruction is then submitted to the black-box LLM for zero-shot evaluation, and the resulting performance is fed to Bayesian optimization, which proposes new soft prompts that improve zero-shot performance. We evaluate InstructZero on different combinations of open-source LLMs and APIs, including Vicuna and ChatGPT. Our results show that InstructZero outperforms state-of-the-art auto-instruction methods across a variety of downstream tasks. Our code and data are publicly available at https://github.com/Lichang-Chen/InstructZero.
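The iteration described above can be sketched as a simple optimization loop. This is a minimal illustration, not the paper's implementation: `open_source_llm` and `black_box_llm_score` are hypothetical stand-ins for the real models (Vicuna and ChatGPT in the paper), and plain random search stands in for Bayesian optimization, which the real method uses to propose the next soft prompt.

```python
import random

def open_source_llm(soft_prompt):
    # Toy stand-in: the generated "instruction" depends on the soft prompt.
    # A real open-source LLM would decode an instruction conditioned on it.
    bias = sum(soft_prompt) / len(soft_prompt)
    return f"Answer concisely (bias={bias:.2f})"

def black_box_llm_score(instruction):
    # Toy stand-in for zero-shot evaluation on the black-box LLM:
    # recover the bias from the instruction text and score it (peak at 0.5).
    bias = float(instruction.split("bias=")[1].rstrip(")"))
    return 1.0 - abs(bias - 0.5)

def instructzero_loop(dim=4, iterations=30, seed=0):
    """Skeleton of the InstructZero loop: propose a low-dimensional soft
    prompt, convert it to an instruction with the open-source LLM, evaluate
    the instruction zero-shot with the black-box LLM, and keep the best.
    Random search replaces the paper's Bayesian optimization here."""
    rng = random.Random(seed)
    best_instruction, best_score = None, float("-inf")
    for _ in range(iterations):
        soft_prompt = [rng.random() for _ in range(dim)]  # candidate soft prompt
        instruction = open_source_llm(soft_prompt)        # soft prompt -> instruction
        score = black_box_llm_score(instruction)          # zero-shot evaluation
        if score > best_score:
            best_instruction, best_score = instruction, score
    return best_instruction, best_score
```

In the actual method, the evaluated (soft prompt, score) pairs form the dataset for a Bayesian-optimization surrogate, so each new proposal is informed by all previous evaluations rather than drawn at random.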
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Mathematical Reasoning | SVAMP | Accuracy | 79.5 | 368 |
| Mathematical Reasoning | AQUA-RAT | Accuracy | 54.331 | 57 |
| Prompt Selection | Selected tasks APE-generated prompt pools APE design (test) | Averaged Performance Rank | 2.58 | 44 |
| Instruction Induction | Instruction Induction (test) | Active to Passive | 99.7 | 10 |
| Instruction Induction | Instruction Induction (test) | Antonyms | 0.827 | 6 |
| Reasoning Tasks | GSM8K | Score | 74.299 | 5 |