Signs Beat Floats: Low-Rank Double-Binary Adaptation for On-Device Fine-Tuning
About
On-device adaptation of large language models commonly keeps a quantized base model frozen while training and deploying a small, task-specific LoRA adapter. In the unmerged adapter-mode setting, however, the adapter is more than a compact storage module: it introduces an additional dense floating-point branch, maintains a trainable state for local updates, and acts as a unit of communication and hot-swapping. We introduce LoRDBA, a LoRA-compatible adapter that replaces both low-rank factors with binary sign carriers and represents magnitudes through lightweight channel-wise scales. This converts the dense adapter branch into two sign-accumulation matrix multiplications interleaved with channel-wise scaling. A finite-sample analysis shows that reconstruction quality is governed by the residual-to-magnitude ratio of the original LoRA factors. In adapter-mode experiments, LoRDBA outperforms low-bit baselines at matched model sizes and matches fp16 LoRA quality in selected regimes. At matched rank r=16, the unmerged adapter incurs at most 8% prefill latency overhead despite a more than 10x reduction in adapter footprint, with a moderate training memory overhead of approximately 1.6x that of fp16 LoRA.
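To make the "two sign-accumulation matrix multiplications interleaved with channel-wise scaling" concrete, here is a minimal NumPy sketch. It is an illustration under stated assumptions, not the LoRDBA implementation: the abstract does not specify how the scales are computed or where they are placed, so this sketch assumes one scale per row of each factor, set to that row's mean absolute value (the least-squares scale for a fixed sign pattern). The function names and the footprint accounting (1 bit per sign, 16 bits per scale) are hypothetical.

```python
import numpy as np

def binarize_rows(M):
    """Split M into a +/-1 sign matrix and one scale per row.

    Assumption: the scale is the row's mean absolute value; the paper may
    use a different (e.g. learned) channel-wise scale.
    """
    signs = np.where(M >= 0, 1.0, -1.0)
    scales = np.abs(M).mean(axis=1)
    return signs, scales

def double_binary_forward(x, A, B):
    """Approximate the LoRA branch x @ A.T @ B.T using sign matrices.

    x: (batch, d_in), A: (r, d_in), B: (d_out, r).
    """
    sign_a, scale_a = binarize_rows(A)   # scale_a: (r,)
    sign_b, scale_b = binarize_rows(B)   # scale_b: (d_out,)
    h = (x @ sign_a.T) * scale_a         # sign accumulation, then per-rank scale
    return (h @ sign_b.T) * scale_b      # sign accumulation, then per-output scale

def adapter_bits_fp16(d_in, d_out, r):
    """Bits needed to store both LoRA factors in fp16."""
    return 16 * r * (d_in + d_out)

def adapter_bits_binary(d_in, d_out, r):
    """Bits for 1-bit signs plus fp16 scales (hypothetical accounting)."""
    return r * (d_in + d_out) + 16 * (r + d_out)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.standard_normal((2, 64))
    A = rng.standard_normal((16, 64))
    B = rng.standard_normal((32, 16))
    exact = x @ A.T @ B.T
    approx = double_binary_forward(x, A, B)
    print("relative error:", np.linalg.norm(exact - approx) / np.linalg.norm(exact))
    print("footprint ratio:", adapter_bits_fp16(4096, 4096, 16) / adapter_bits_binary(4096, 4096, 16))
```

When every row of a factor has constant magnitude, the sign-plus-scale form reproduces that factor exactly; the further the entries deviate from their row scale, the larger the residual, which is the intuition behind the residual-to-magnitude ratio mentioned in the abstract. Under the accounting assumed here, a hypothetical 4096x4096 layer at r=16 stores about 10.65x fewer bits than fp16 LoRA; the paper's own footprint accounting may differ.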
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Mathematical Reasoning | GSM8K | Accuracy (Acc) 53.6 | 337 |
| Mathematical Reasoning | Minerva Math | Accuracy 12.28 | 104 |
| Summarization | Xsum | ROUGE-L 15.55 | 42 |
| Mathematical Reasoning | Minerva Math | Minerva Math 4-shot Accuracy 16.5 | 25 |
| Mathematical Reasoning | GSM8K | GSM8K 8-shot Accuracy 55.04 | 25 |
| Mathematics | Minerva Math | 4-shot Performance (%) 10.04 | 21 |
| Abstractive Summarization | Xsum | ROUGE-L 18.6 | 10 |
| Summarization | Xsum | ROUGE-L F17.24 | 9 |