LLM-Adapters: An Adapter Family for Parameter-Efficient Fine-Tuning of Large Language Models
About
The success of large language models (LLMs) such as GPT-4 and ChatGPT has led to numerous cost-effective and accessible alternatives, created by fine-tuning open-access LLMs on task-specific data (e.g., ChatDoctor) or instruction data (e.g., Alpaca). Among the various fine-tuning methods, adapter-based parameter-efficient fine-tuning (PEFT) is one of the most attractive, as it requires fine-tuning only a few external parameters rather than the entire LLM while achieving comparable or even better performance. To enable further research on PEFT methods for LLMs, this paper presents LLM-Adapters, an easy-to-use framework that integrates various adapters into LLMs and can run these adapter-based PEFT methods on different tasks. The framework includes state-of-the-art open-access LLMs such as LLaMA, BLOOM, and GPT-J, as well as widely used adapters such as Series adapters, Parallel adapters, Prompt-based learning, and Reparametrization-based methods. Moreover, we conduct extensive empirical studies on the impact of adapter types, placement locations, and hyper-parameters to identify the best design for each adapter-based method. We evaluate the effectiveness of the adapters on fourteen datasets from two different reasoning tasks, Arithmetic Reasoning and Commonsense Reasoning. The results demonstrate that using adapter-based PEFT in smaller-scale LLMs (7B) with few extra trainable parameters yields comparable, and in some cases superior, performance to powerful LLMs (175B) in zero-shot inference on both reasoning tasks.
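To make the "few external parameters" claim concrete, the sketch below counts trainable parameters for a single square projection matrix under three regimes: full fine-tuning, a low-rank reparametrization-style update, and a bottleneck adapter of the kind used in series or parallel placement. The hidden size, rank, and bottleneck width are illustrative assumptions, not settings reported in the abstract, and the function names are hypothetical rather than part of the LLM-Adapters API.

```python
# Illustrative sketch only: trainable-parameter counts for one projection matrix.
# All sizes below are assumptions chosen for illustration, not values from the paper.

def full_finetune_params(d_in: int, d_out: int) -> int:
    """Parameters updated when the whole d_out x d_in weight matrix is trained."""
    return d_in * d_out


def low_rank_params(d_in: int, d_out: int, rank: int) -> int:
    """Reparametrization-style update W + B @ A, with A (rank x d_in) and B (d_out x rank)."""
    return rank * d_in + d_out * rank


def bottleneck_adapter_params(d_model: int, bottleneck: int) -> int:
    """Bottleneck adapter: down-projection and up-projection, each with a bias vector."""
    down = d_model * bottleneck + bottleneck
    up = bottleneck * d_model + d_model
    return down + up


d_model = 4096     # assumed hidden size
rank = 8           # assumed low-rank dimension
bottleneck = 256   # assumed adapter bottleneck width

full = full_finetune_params(d_model, d_model)
low_rank = low_rank_params(d_model, d_model, rank)
adapter = bottleneck_adapter_params(d_model, bottleneck)

print(f"full fine-tuning  : {full:>12,d} parameters")
print(f"low-rank update   : {low_rank:>12,d} parameters ({100 * low_rank / full:.2f}% of full)")
print(f"bottleneck adapter: {adapter:>12,d} parameters ({100 * adapter / full:.2f}% of full)")
```

Under these assumed sizes, both adapter variants train a small fraction of the weights that full fine-tuning would update for the same matrix; the exact ratio depends on the chosen rank or bottleneck width and on how many layers receive an adapter.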
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Image Classification | CIFAR-100 (test) | Accuracy: 87.97 | 3518 |
| Image Classification | CIFAR-10 (test) | Accuracy: 97.94 | 3381 |
| Mathematical Reasoning | GSM8K (test) | Accuracy: 60.8 | 797 |
| Natural Language Understanding | GLUE (dev) | SST-2 (Acc): 96 | 504 |
| Commonsense Reasoning | Common Sense Reasoning Tasks | Avg Score: 77 | 241 |
| Instruction Following | MT-Bench | MT-Bench Score: 5.7 | 189 |
| Commonsense Reasoning | Commonsense Reasoning (BoolQ, PIQA, SIQA, HellaS., WinoG., ARC-e, ARC-c, OBQA) (test) | BoolQ Accuracy: 88 | 138 |
| Arithmetic Reasoning | GSM8K (test) | Accuracy: 47.5 | 129 |
| Image Classification | VTAB 1k (test) | -- | 121 |
| Image Classification | Food101 (test) | Accuracy: 84.27 | 87 |