Signs Beat Floats: Low-Rank Double-Binary Adaptation for On-Device Fine-Tuning

About

On-device adaptation of large language models commonly keeps a quantized base model frozen while training and deploying a small, task-specific LoRA adapter. In the unmerged adapter-mode setting, however, the adapter is more than a compact storage module: it introduces an additional dense floating-point branch, maintains a trainable state for local updates, and acts as a unit of communication and hot-swapping. We introduce LoRDBA, a LoRA-compatible adapter that replaces both low-rank factors with binary sign carriers while representing magnitudes through lightweight, channel-wise scales. This converts the dense adapter branch into two sign-accumulation matrix multiplications interleaved with channel-wise scaling. A finite-sample analysis shows that reconstruction quality is governed by the residual-to-magnitude ratio of the original LoRA factors. In adapter-mode experiments, LoRDBA outperforms low-bit baselines at matched model sizes while matching fp16 LoRA quality in selected regimes. The unmerged adapter incurs at most 8% prefill latency overhead at matched rank r=16 despite a more than 10x reduction in adapter footprint, with a moderate training memory overhead of approximately 1.6x that of fp16 LoRA.
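The idea of sign carriers plus channel-wise scales can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not specify how the scales are placed or fitted, so the per-row mean-absolute-value scale, the factor shapes, and the function names `binarize_rows` and `double_binary_forward` are all assumptions made for illustration.

```python
import numpy as np

def binarize_rows(M):
    """Approximate M by diag(scale) @ signs, with one scale per row.

    Assumption for illustration: the scale is the row-wise mean of |M|,
    which is the least-squares optimal scale for a fixed sign pattern.
    """
    signs = np.where(M >= 0, 1.0, -1.0)   # binary sign carrier in {-1, +1}
    scale = np.abs(M).mean(axis=1)        # one magnitude per row (channel)
    return signs, scale

def double_binary_forward(x, S_A, a, S_B, b):
    """Adapter branch as two sign matmuls interleaved with channel scaling.

    Computes b * (S_B @ (a * (S_A @ x))), i.e. the product
    diag(b) S_B diag(a) S_A applied to x without forming a dense matrix.
    """
    h = a * (S_A @ x)      # first sign accumulation, then rank-channel scale
    return b * (S_B @ h)   # second sign accumulation, then output-channel scale

# Hypothetical LoRA factors: A is (r, d_in), B is (d_out, r).
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 8))
B = rng.standard_normal((6, 4))
S_A, a = binarize_rows(A)
S_B, b = binarize_rows(B)
y = double_binary_forward(rng.standard_normal(8), S_A, a, S_B, b)
```

In this sketch a row whose entries all share the same magnitude is reconstructed exactly, and the error grows as the entries deviate from the row's mean magnitude, which is loosely in the spirit of the residual-to-magnitude dependence mentioned in the abstract.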

Yoshihiko Fujisawa, Yuma Ichikawa, Yudai Fujimoto, Akira Sakai, Katsuki Fujisawa • 2026

Related benchmarks

Task | Dataset | Result | Rank
Mathematical Reasoning | GSM8K | Accuracy (Acc) 53.6 | 337
Mathematical Reasoning | Minerva Math | Accuracy 12.28 | 104
Summarization | Xsum | ROUGE-L 15.55 | 42
Mathematical Reasoning | Minerva Math | Minerva Math 4-shot Accuracy 16.5 | 25
Mathematical Reasoning | GSM8K | GSM8K 8-shot Accuracy 55.04 | 25
Mathematics | Minerva Math | 4-shot Performance (%) 10.04 | 21
Abstractive Summarization | Xsum | ROUGE-L 18.6 | 10
Summarization | Xsum | ROUGE-L F17.24 | 9
