ADMM-Q: An Improved Hessian-based Weight Quantizer for Post-Training Quantization of Large Language Models

About

Quantization is an effective strategy for reducing the storage and computation footprint of large language models (LLMs), and post-training quantization (PTQ) is a leading approach for compressing them. Popular weight quantization procedures, including GPTQ and RTN, lose substantial model utility, especially at aggressive quantization levels (sub-4-bit). We propose ADMM-Q, a novel weight quantization algorithm for the layer-wise quantization problem, based on a combinatorial variant of the Alternating Direction Method of Multipliers (ADMM). Our operator-splitting procedure updates the weights continuously to minimize the layer-wise reconstruction error while gradually enforcing the quantization constraints, with convergence guarantees. We propose additional algorithmic enhancements (e.g., penalty scheduling, preconditioning, and a local search post-processing step) to make ADMM-Q efficient at LLM scale. ADMM-Q is modular and can be used as a drop-in replacement for any weight quantizer within existing quantization pipelines: it is fully composable with existing techniques including range clipping, learned or random rotations, and activation scaling. Using ADMM-Q in place of GPTQ on Qwen3-8B reduces WikiText-2 perplexity in: (i) the W3A16 weight-only setting (12.85 $\rightarrow$ 10.06); (ii) the W4A8 SmoothQuant procedure (9.29 $\rightarrow$ 8.68); and (iii) the W2A4KV4 SpinQuant procedure (66.11 $\rightarrow$ 19.42).
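To make the operator-splitting idea concrete, here is a minimal NumPy sketch of textbook scaled-form ADMM applied to a layer-wise reconstruction objective, $\min_Q \tfrac{1}{2}\|XW_0 - XQ\|_F^2$ with $Q$ restricted to a quantization grid. It is not the authors' implementation: the function names (`column_scales`, `project_to_grid`, `admm_quantize`), the symmetric per-column uniform grid, the geometric penalty schedule, and the choice to return the best iterate are all assumptions made for illustration, and the preconditioning and local search steps mentioned in the abstract are omitted.

```python
import numpy as np


def column_scales(W, bits):
    """One fixed scale per output column for a symmetric uniform grid (assumed grid)."""
    qmax = 2 ** (bits - 1) - 1
    scale = np.abs(W).max(axis=0, keepdims=True) / qmax
    return np.where(scale == 0, 1.0, scale)


def project_to_grid(V, scale, bits):
    """Round-to-nearest projection onto the fixed grid {-(qmax+1), ..., qmax} * scale."""
    qmax = 2 ** (bits - 1) - 1
    return np.clip(np.round(V / scale), -qmax - 1, qmax) * scale


def recon_error(W0, Q, H):
    """Layer-wise error ||X W0 - X Q||_F^2, computed through H = X^T X."""
    D = W0 - Q
    return float(np.sum(D * (H @ D)))


def admm_quantize(W0, X, bits=3, rho0=0.1, rho_growth=1.1, iters=100):
    """Generic ADMM sketch: continuous W-update, grid projection, dual update."""
    H = X.T @ X                          # Hessian of the layer-wise objective
    d = H.shape[0]
    HW0 = H @ W0
    scale = column_scales(W0, bits)
    rho = rho0 * np.mean(np.diag(H))     # penalty relative to the Hessian's magnitude

    Q = project_to_grid(W0, scale, bits)  # start from round-to-nearest (RTN)
    U = np.zeros_like(W0)                 # scaled dual variable
    best_Q, best_err = Q, recon_error(W0, Q, H)

    for _ in range(iters):
        # W-update: solve (H + rho I) W = H W0 + rho (Q - U)
        W = np.linalg.solve(H + rho * np.eye(d), HW0 + rho * (Q - U))
        # Q-update: project onto the quantization grid
        Q = project_to_grid(W + U, scale, bits)
        # Dual update (scaled form)
        U = U + W - Q

        err = recon_error(W0, Q, H)
        if err < best_err:
            best_Q, best_err = Q, err

        # Penalty schedule: grow rho and rescale the scaled dual accordingly
        rho_new = rho * rho_growth
        U *= rho / rho_new
        rho = rho_new

    return best_Q
```

Because the sketch starts from the RTN solution and returns the best grid point seen, its reconstruction error is never worse than RTN on the same grid. Growing the penalty gradually pulls the continuous iterate `W` toward the grid, which is one simple way to realize "gradually enforcing the quantization constraints".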

Ryan Lucas, Mehdi Makni, Xiang Meng, Adam Deng, Rahul Mazumder • 2026

Related benchmarks

Task	Dataset	Result
Language Modeling	WikiText-2 (test)	PPL 8.68	2416
Language Modeling	C4	Perplexity 11.86	1688
Language Modeling	PTB	Perplexity 17.45	1234
Language Modeling	C4 (val)	PPL 15.69	908
Language Modeling	PTB (test)	Perplexity 23.44	543
Language Modeling	Wiki2	PPL 6.79	382
Zero-shot Task Evaluation	tasks 0-shot	Accuracy 60.86	83
Zero-shot Evaluation	Evaluation Benchmarks Zero-shot	Average Accuracy 66.1	55
Zero-shot Evaluation	0-shot	Accuracy 71.07	17
