Improving LoRA in Privacy-preserving Federated Learning
About
Low-rank adaptation (LoRA) is one of the most popular task-specific parameter-efficient fine-tuning (PEFT) methods for pre-trained language models, owing to its good performance and computational efficiency. LoRA injects a product of two trainable rank decomposition matrices on top of each frozen pre-trained model module. However, when applied in the setting of privacy-preserving federated learning (FL), LoRA may become unstable for the following reasons: 1) the effects of data heterogeneity and multi-step local updates are non-negligible, 2) additive noise imposed on gradient updates to guarantee differential privacy (DP) can be amplified, and 3) the final performance is sensitive to hyper-parameters. A key factor behind these phenomena is the discordance between local clients jointly optimizing the two low-rank matrices and the central server aggregating them separately. Thus, this paper proposes an efficient and effective version of LoRA, Federated Freeze A LoRA (FFA-LoRA), to alleviate these challenges and further halve the communication cost of federated fine-tuning of LLMs. The core idea of FFA-LoRA is to fix the randomly initialized non-zero matrices and only fine-tune the zero-initialized matrices. Compared to LoRA, FFA-LoRA is motivated by practical and theoretical benefits in privacy-preserving FL. Our experiments demonstrate that FFA-LoRA provides more consistent performance and better computational efficiency than vanilla LoRA in various FL tasks.
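The aggregation discordance described in the abstract can be illustrated with a small numerical sketch. It is not the paper's implementation: the matrix names `A` (randomly initialized) and `B` (the trained factor) follow the usual LoRA convention, the sizes and the number of clients are arbitrary, and random matrices stand in for the factors each client would hold after local training. The sketch compares the average of the clients' low-rank products with the product of the separately averaged factors, first when both factors differ per client (vanilla LoRA) and then when `A` is frozen and shared (the FFA-LoRA idea).

```python
import random

def matmul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def mean_matrices(mats):
    """Element-wise average of equally shaped matrices (server-side averaging)."""
    n = len(mats)
    rows, cols = len(mats[0]), len(mats[0][0])
    return [[sum(m[i][j] for m in mats) / n for j in range(cols)] for i in range(rows)]

def max_abs_diff(X, Y):
    """Largest absolute element-wise difference between two matrices."""
    return max(abs(x - y) for rx, ry in zip(X, Y) for x, y in zip(rx, ry))

def rand_matrix(rows, cols, rng):
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]

rng = random.Random(0)
d, r, k, num_clients = 4, 2, 3, 3  # hypothetical sizes, chosen only for illustration

# Stand-ins for each client's factors after local training.
B_clients = [rand_matrix(d, r, rng) for _ in range(num_clients)]
A_clients = [rand_matrix(r, k, rng) for _ in range(num_clients)]

# Vanilla LoRA: clients train both factors, the server averages each one separately.
ideal_lora = mean_matrices([matmul(B, A) for B, A in zip(B_clients, A_clients)])
aggregated_lora = matmul(mean_matrices(B_clients), mean_matrices(A_clients))
lora_gap = max_abs_diff(ideal_lora, aggregated_lora)

# FFA-LoRA idea: A is frozen and identical on every client, only B is trained and sent.
A_frozen = rand_matrix(r, k, rng)
ideal_ffa = mean_matrices([matmul(B, A_frozen) for B in B_clients])
aggregated_ffa = matmul(mean_matrices(B_clients), A_frozen)
ffa_gap = max_abs_diff(ideal_ffa, aggregated_ffa)

print("vanilla LoRA aggregation gap:", lora_gap)
print("frozen-A aggregation gap:", ffa_gap)
```

With a shared frozen `A`, averaging `B` is the same as averaging the full low-rank updates, because the product is linear in `B`, so the second gap is zero up to floating-point rounding. In the vanilla case the product of averages generally differs from the average of products. Since only `B` is exchanged, each client also sends roughly half as many adapter parameters when the two factors have similar size.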
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Image Classification | Tiny ImageNet (test) | Accuracy | 44.62 | 362 |
| Math Reasoning | GSM8K (test) | Accuracy | 25.4 | 192 |
| Natural Language Understanding | GLUE (val) | SST-2 | 95.64 | 191 |
| Commonsense Reasoning | Commonsense Reasoning (BoolQ, PIQA, SIQA, HellaS., WinoG., ARC-e, ARC-c, OBQA) | BoolQ Accuracy | 82.88 | 129 |
| Question Answering | SQuAD (test) | F1 | 91.07 | 111 |
| Question Answering | SQuAD v1.1 | F1 | 90.31 | 79 |
| Commonsense Reasoning | Commonsense Reasoning Suite (test) | HellaSwag Accuracy | 0.5773 | 62 |
| Paraphrase Detection | QQP (test) | Accuracy | 88.51 | 51 |
| Commonsense Reasoning | COPA (test) | Accuracy | 89 | 46 |
| Code Generation | HumanEval and MBPP | Overall Average Score | 20.33 | 37 |