Generative Adapter: Contextualizing Language Models in Parameters with A Single Forward Pass
About
Large language models (LMs) are typically adapted to new contexts (e.g., text prompts that define new tasks or domains) through fine-tuning or prompting. However, there is an accuracy-compute tradeoff: fine-tuning incurs significant training cost, and prompting increases inference overhead. We introduce *GenerativeAdapter*, an effective and efficient adaptation method that directly maps new contexts to low-rank LM adapters, significantly reducing inference overhead with no need for fine-tuning. The adapter generator is trained via self-supervised learning and can adapt a single frozen LM to any new task simply by mapping the associated task or domain context to a new adapter. We apply *GenerativeAdapter* to two pretrained LMs (Mistral-7B-Instruct and Llama2-7B-Chat) and evaluate the adapted models in three adaptation scenarios: knowledge acquisition from documents, learning from demonstrations, and personalization for users. On StreamingQA, our approach effectively injects knowledge into the LM's parameters, achieving a 63.5% improvement in F1 score over the model with supervised fine-tuning (from 19.5 to 31.5) for contexts as long as 32K tokens. In the MetaICL in-context learning evaluation, our method achieves an average accuracy of 44.9 across 26 tasks, outperforming the base model. On MSC, our method proves highly competitive at memorizing user information from conversations, with a 4x reduction in computation and memory costs compared to prompting with the full conversation history. Together, these results suggest that *GenerativeAdapter* enables general adaptation to a wide range of contexts.
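The core idea above is that a generator network maps a context (via the frozen LM's hidden states) directly to a low-rank adapter in a single forward pass, rather than running gradient-based fine-tuning. The following is a minimal numpy sketch of that mapping, not the paper's actual architecture: the generator projections, pooling, dimensions, and the function name `generate_adapter` are all illustrative assumptions. It only shows how a pooled context vector can produce a LoRA-style low-rank weight delta `B @ A`.

```python
import numpy as np

rng = np.random.default_rng(0)
hidden_dim, rank, seq_len = 64, 4, 10

# Stand-ins for the (learned) generator projections; in the real method
# these would be trained via self-supervised learning.
W_a = rng.standard_normal((hidden_dim, rank * hidden_dim)) * 0.02
W_b = rng.standard_normal((hidden_dim, hidden_dim * rank)) * 0.02

def generate_adapter(context_hidden):
    """Map context hidden states to a LoRA-style low-rank delta W = B @ A.

    context_hidden: (seq_len, hidden_dim) array, a stand-in for the
    frozen LM's hidden states over the new context.
    """
    pooled = context_hidden.mean(axis=0)              # pool context states
    A = (pooled @ W_a).reshape(rank, hidden_dim)      # low-rank factor A
    B = (pooled @ W_b).reshape(hidden_dim, rank)      # low-rank factor B
    return B @ A                                      # rank <= `rank` delta

ctx = rng.standard_normal((seq_len, hidden_dim))      # fake context states
delta_w = generate_adapter(ctx)
print(delta_w.shape)                                  # (64, 64)
```

The adapted layer would then use `W + delta_w` in place of its frozen weight `W`; because the delta has rank at most 4 here, storing and applying it is cheap compared to caching the full context for prompting.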
Related benchmarks
| Task | Dataset | Metric | Score | Rank |
|---|---|---|---|---|
| Multi-hop Question Answering | 2WikiMultihopQA | -- | -- | 278 |
| Multi-hop Question Answering | HotpotQA | F1 Score | 40.8 | 221 |
| Single-hop Question Answering | SQuAD | F1 Score | 70.3 | 21 |
| Long-context Question Answering | SQuAD 2K | Answer F1 Score | 39.9 | 6 |
| Long-context Question Answering | SQuAD 1K | Answer F1 | 43 | 6 |
| Single-hop Question Answering | MS MARCO V1 | F1 Score (Answer) | 35 | 6 |
| Long-context Question Answering | SQuAD 512 | Answer F1 | 0.488 | 6 |
| Multi-hop Question Answering | MuSiQue | Answer F1 | 19.4 | 6 |
| Single-hop Question Answering | MS MARCO V2 | Answer F1 Score | 27.9 | 6 |
| Question Answering | SQuAD | ROUGE-L Recall | 64.3 | 3 |