ADMM-Q: An Improved Hessian-based Weight Quantizer for Post-Training Quantization of Large Language Models

About

Quantization is an effective strategy to reduce the storage and computation footprint of large language models (LLMs). Post-training quantization (PTQ) is a leading approach for compressing LLMs. Popular weight quantization procedures, including GPTQ and RTN, suffer in model utility, especially at aggressive quantization levels (sub-4-bit). We propose ADMM-Q, a novel weight quantization algorithm that considers the layer-wise quantization problem. Our algorithm is based on a combinatorial variant of the Alternating Direction Method of Multipliers (ADMM). Our operator-splitting procedure updates weights continuously to minimize the layer-wise reconstruction error, while gradually enforcing the quantization constraints with convergence guarantees. We propose additional algorithmic enhancements (e.g., penalty scheduling, preconditioning, and a local search post-processing step) to make ADMM-Q efficient at LLM scale. ADMM-Q is modular and can be used as a drop-in replacement for any weight quantizer within existing quantization pipelines: ADMM-Q is fully composable with existing techniques including range clipping, learned or random rotations, and activation scaling. Using ADMM-Q in place of GPTQ on Qwen3-8B, we decrease WikiText-2 perplexity in: (i) the W3A16 weight-only setting (12.85 $\rightarrow$ 10.06); (ii) the W4A8 SmoothQuant procedure (9.29 $\rightarrow$ 8.68); and (iii) the W2A4KV4 SpinQuant procedure (66.11 $\rightarrow$ 19.42).
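The operator-splitting idea described above can be illustrated with a textbook scaled-form ADMM loop applied to the layer-wise reconstruction objective. This is only a minimal sketch under stated assumptions, not the authors' implementation: the function name `admm_quantize_sketch`, the per-column min/max grid, the geometric penalty schedule, and all default parameters are hypothetical, and the preconditioning and local search steps mentioned in the abstract are omitted.

```python
import numpy as np


def admm_quantize_sketch(X, W0, bits=3, rho=0.1, growth=1.2, iters=50):
    """Illustrative ADMM loop for layer-wise weight quantization.

    Approximately minimizes 0.5 * ||X W0 - X Z||_F^2 with Z restricted to a
    uniform per-column grid. All names and defaults here are assumptions.
    """
    H = X.T @ X                      # layer-wise Hessian proxy, shape (d_in, d_in)
    d = H.shape[0]
    HW0 = H @ W0

    # Assumed grid: uniform min/max levels per output column.
    lo = W0.min(axis=0, keepdims=True)
    hi = W0.max(axis=0, keepdims=True)
    levels = 2 ** bits - 1
    scale = (hi - lo) / levels
    scale = np.where(scale == 0, 1.0, scale)  # guard against constant columns

    def project(V):
        # Nearest grid point: the projection onto the quantization set.
        q = np.clip(np.round((V - lo) / scale), 0, levels)
        return q * scale + lo

    Z = project(W0)                  # start from round-to-nearest
    U = np.zeros_like(W0)            # scaled dual variable

    for _ in range(iters):
        # W-update: minimize 0.5*||X W - X W0||^2 + (rho/2)*||W - Z + U||^2,
        # whose optimality condition is (H + rho I) W = H W0 + rho (Z - U).
        W = np.linalg.solve(H + rho * np.eye(d), HW0 + rho * (Z - U))
        # Z-update: project the continuous weights back onto the grid.
        Z = project(W + U)
        # Dual update in scaled form.
        U = U + W - Z
        # Assumed penalty schedule: grow rho and rescale the scaled dual.
        rho_new = rho * growth
        U *= rho / rho_new
        rho = rho_new

    return Z


# Small synthetic usage example (random data, for illustration only).
rng = np.random.default_rng(0)
X = rng.standard_normal((64, 16))    # calibration activations
W0 = rng.standard_normal((16, 8))    # full-precision weights
Wq = admm_quantize_sketch(X, W0, bits=3)
recon_err = np.linalg.norm(X @ W0 - X @ Wq) ** 2
```

The continuous `W` is free to move off the grid while the growing penalty `rho` pulls it toward the projected copy `Z`, which is the sense in which the constraints are enforced gradually. The returned `Z` always lies on the grid by construction; this sketch makes no claim about how its error compares with GPTQ or RTN.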

Ryan Lucas, Mehdi Makni, Xiang Meng, Adam Deng, Rahul Mazumder • 2026

Related benchmarks

Task | Dataset | Result | Rank
Language Modeling | WikiText-2 (test) | PPL 8.68 | 2333
Language Modeling | C4 | Perplexity 11.86 | 1688
Language Modeling | PTB | Perplexity 17.45 | 1234
Language Modeling | C4 (val) | PPL 15.69 | 737
Language Modeling | PTB (test) | Perplexity 23.44 | 543
Language Modeling | Wiki2 | PPL 6.79 | 326
Zero-shot Task Evaluation | tasks 0-shot | Accuracy 60.86 | 74
Zero-shot Evaluation | Evaluation Benchmarks Zero-shot | Average Accuracy 66.1 | 34
Zero-shot Evaluation | 0-shot | Accuracy 71.07 | 17
