Towards Modular LLMs by Building and Reusing a Library of LoRAs
About
The growing number of parameter-efficient adaptations of a base large language model (LLM) calls for studying whether we can reuse such trained adapters to improve performance on new tasks. We study how to best build a library of adapters given multi-task data and devise techniques for both zero-shot and supervised task generalization through routing in such a library. We benchmark existing approaches for building this library and introduce model-based clustering (MBC), a method that groups tasks based on the similarity of their adapter parameters, indirectly optimizing for transfer across the multi-task dataset. To reuse the library, we present a novel zero-shot routing mechanism, Arrow, which dynamically selects the most relevant adapters for new inputs without any retraining. We experiment with several LLMs, such as Phi-2 and Mistral, on a wide array of held-out tasks, verifying that MBC-based adapters and Arrow routing yield superior generalization to new tasks. Overall, we take steps towards creating modular, adaptable LLMs that can match or outperform traditional joint training.
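The abstract describes Arrow as a zero-shot router that picks the most relevant adapters from the library per input. The sketch below illustrates one plausible reading of that idea, assuming each LoRA's "prototype" direction is the top right-singular vector of its A matrix and routing scores are the absolute dot products of an input representation with those prototypes; the shapes, the `arrow_prototype`/`arrow_route` names, and the toy random library are all illustrative assumptions, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy library: one LoRA pair (A, B) per task for a single linear layer.
# d = hidden size, r = LoRA rank (values are arbitrary for illustration).
d, r, num_tasks = 32, 4, 8
library = [(rng.normal(size=(r, d)), rng.normal(size=(d, r)))
           for _ in range(num_tasks)]

def arrow_prototype(A):
    """Top right-singular vector of the LoRA A matrix: the input
    direction this adapter responds to most strongly (an assumption
    about how a prototype could be defined)."""
    _, _, Vt = np.linalg.svd(A, full_matrices=False)
    return Vt[0]

# One prototype per adapter in the library, computed once, offline.
protos = np.stack([arrow_prototype(A) for A, _ in library])

def arrow_route(x, protos, top_k=2):
    """Score every adapter by |x . prototype| (sign of a singular
    vector is arbitrary), keep the top-k, and softmax their scores
    into mixing weights."""
    scores = np.abs(protos @ x)
    top = np.argsort(scores)[-top_k:]
    w = np.exp(scores[top] - scores[top].max())
    return top, w / w.sum()

# Route a random input representation to its top-2 adapters.
x = rng.normal(size=d)
idx, weights = arrow_route(x, protos)
print(idx, weights)
```

Because the prototypes are derived from the adapter weights alone, this kind of routing needs no extra training data and no learned gating network, which is what makes a zero-shot setting possible.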
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Reasoning | BBH | Accuracy | 54.75 | 507 |
| Reading Comprehension | BoolQ | Accuracy | 81.16 | 219 |
| Reasoning | ARC Easy | Accuracy | 80.53 | 183 |
| Reasoning | HellaSwag (HS) | Accuracy | 71.89 | 142 |
| Science Question Answering | ARC-E | Accuracy | 83.38 | 138 |
| Reasoning | PIQA | Accuracy | 80.2 | 133 |
| Science Question Answering | ARC-C | Accuracy | 54.84 | 127 |
| Reasoning | WinoGrande (WG) | Accuracy | 65.98 | 87 |
| Reasoning | ARC | Accuracy | 53.85 | 83 |
| Reasoning | OpenBookQA | Accuracy | 47.4 | 63 |