
LBLLM: Lightweight Binarization of Large Language Models via Three-Stage Distillation

About

Deploying large language models (LLMs) in resource-constrained environments is hindered by heavy computational and memory requirements. We present LBLLM, a lightweight binarization framework that achieves effective W(1+1)A4 quantization through a novel three-stage quantization strategy. The framework proceeds as follows: (1) initialize a high-quality quantized model via PTQ; (2) quantize binarized weights, group-wise bitmaps, and quantization parameters through layer-wise distillation while keeping activations in full precision; and (3) train learnable activation quantization factors to dynamically quantize activations to 4 bits. This decoupled design mitigates interference between weight and activation quantization, yielding greater training stability and better inference accuracy. LBLLM, trained on only 0.016B tokens with a single GPU, surpasses existing state-of-the-art binarization methods in the W2A4 quantization setting across language modeling, commonsense QA, and language understanding tasks. These results demonstrate that extreme low-bit quantization of LLMs can be both practical and highly effective without introducing the extra high-precision channels or rotation matrices commonly used in recent PTQ-based works, offering a promising path toward efficient LLM deployment in resource-limited situations.
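To make the W(1+1)A4 format concrete, here is a minimal Python sketch of what such a number format could look like: each weight keeps a 1-bit sign plus a 1-bit group bitmap entry that selects one of two scales, and activations are rounded to signed 4-bit integers. This is an illustration under stated assumptions, not the authors' implementation. The function names, the mean-magnitude split rule for the bitmap, and the fixed activation scale are all hypothetical; in LBLLM the bitmaps and quantization factors are learned through distillation, which this sketch does not attempt to reproduce.

```python
def binarize_row(weights):
    """Binarize one weight row into signs, a 1-bit group bitmap, and two scales.

    Assumption: the bitmap splits weights by whether their magnitude exceeds
    the row's mean magnitude. The paper learns its bitmaps instead.
    """
    mags = [abs(w) for w in weights]
    threshold = sum(mags) / len(mags)
    signs = [1 if w >= 0 else -1 for w in weights]
    bitmap = [1 if m > threshold else 0 for m in mags]
    scales = []
    for group_id in (0, 1):
        group = [m for m, b in zip(mags, bitmap) if b == group_id]
        # The mean magnitude is the L2-optimal scale for sign-binarized values.
        scales.append(sum(group) / len(group) if group else 0.0)
    return signs, bitmap, scales


def dequantize_row(signs, bitmap, scales):
    """Reconstruct approximate weights: sign times the scale of its group."""
    return [s * scales[b] for s, b in zip(signs, bitmap)]


def quantize_activations(acts, scale, bits=4):
    """Round activations to signed integers in [-2^(bits-1), 2^(bits-1) - 1].

    Assumption: `scale` is a fixed number here; the paper trains these factors.
    """
    qmin, qmax = -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    return [max(qmin, min(qmax, round(a / scale))) for a in acts]


signs, bitmap, scales = binarize_row([0.1, -0.2, 0.9, -1.1])
approx = dequantize_row(signs, bitmap, scales)
q_acts = quantize_activations([0.5, -1.0, 3.0, 100.0], scale=0.5)
```

In this toy example the two small weights share one scale and the two large weights share another, so the reconstruction tracks the original magnitudes more closely than a single per-row scale would. The out-of-range activation is clipped to the largest 4-bit value.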

Siqing Song, Chuang Wang, Yong Lang, Yi Yang, Xu-Yao Zhang • 2026

Related benchmarks

Task                                 Dataset             Metric              Result   Rank
Language Modeling                    WikiText-2 (test)   PPL                 9.18     2333
Language Modeling                    WikiText-2          Perplexity (PPL)    9.08     2320
Language Modeling                    C4                  Perplexity          10.42    1688
Language Modeling                    PTB                 Perplexity          26.4     1234
Language Modeling                    C4 (val)            PPL                 10.91    737
Language Modeling                    Wiki                Perplexity (PPL)    18.53    298
Common Sense Reasoning               BoolQ               Accuracy            70.28    240
Reasoning                            ARC Easy            Accuracy            53.11    233
Reasoning                            HellaSwag (HS)      HellaSwag Accuracy  60.05    209
Multiple-choice Question Answering   HellaSwag           Accuracy            58.54    196
Showing 10 of 31 rows
