Astro: Activation-guided Structured Regularization for Outlier-Robust LLM Post-Training Quantization

About

Weight-only post-training quantization (PTQ) is crucial for efficient Large Language Model (LLM) deployment but suffers from accuracy degradation caused by weight and activation outliers. Existing mitigation strategies face critical limitations: they either suppress outliers insufficiently or incur significant deployment costs, such as added inference latency, heavy preprocessing, or reliance on complex operator fusion. To resolve these limitations, we leverage a key insight: over-parameterized LLMs often converge to flat minima, implying a vast space of equivalent solutions in which weights can be adjusted without compromising accuracy. Building on this, we propose Astro, an Activation-guided Structured Regularization framework designed to suppress the negative effects of outliers in a hardware-friendly and efficient manner. Using an activation-guided regularization objective, Astro reconstructs intrinsically robust weights, aggressively suppressing the weight outliers that correspond to high-magnitude activations without sacrificing model accuracy. Crucially, Astro adds no inference latency and is orthogonal to mainstream quantization methods such as GPTQ. Extensive experiments show that Astro is highly competitive; notably, on LLaMA-2-7B, it outperforms complex learning-based rotation methods while using roughly one third of the quantization time.
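To make the outlier problem concrete, the following is a minimal sketch of why a single large weight hurts low-bit quantization. It is not Astro's method: it assumes plain symmetric round-to-nearest quantization with one scale per weight group, and the names `quantize_rtn` and `mean_squared_error`, the group size of 128, and the Gaussian weight distribution are all illustrative choices rather than details from the paper.

```python
import random

def quantize_rtn(weights, bits=4):
    """Symmetric round-to-nearest quantization with one scale per group.

    Illustrative baseline only; not the quantizer used in the paper.
    """
    qmax = 2 ** (bits - 1) - 1                      # 7 for 4-bit
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / qmax if max_abs > 0 else 1.0  # avoid division by zero
    # Map to the integer grid, clamp to the signed range, map back to floats.
    return [max(-qmax - 1, min(qmax, round(w / scale))) * scale for w in weights]

def mean_squared_error(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

rng = random.Random(0)
group = [rng.gauss(0.0, 0.02) for _ in range(128)]  # assumed "ordinary" weights
with_outlier = group[:]
with_outlier[0] = 1.0                               # inject one large outlier

# Measure the error on the 127 ordinary weights only, so the outlier's own
# reconstruction does not enter the comparison.
err_clean = mean_squared_error(group[1:], quantize_rtn(group)[1:])
err_outlier = mean_squared_error(with_outlier[1:], quantize_rtn(with_outlier)[1:])

print(f"MSE on ordinary weights, no outlier:   {err_clean:.3e}")
print(f"MSE on ordinary weights, with outlier: {err_outlier:.3e}")
```

Because the scale is set by the largest magnitude in the group, the outlier stretches the quantization grid and most ordinary weights collapse onto the same level. Reducing such outliers before quantization, which is the goal the abstract describes, keeps the grid fine enough to represent the remaining weights.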

Xi Chen, Ming Li, Junxi Li, Changsheng Li, Peisong Wang, Lizhong Ding, Ye Yuan, Guoren Wang • 2026

Related benchmarks

Task	Dataset	Result
Language Modeling	WikiText2	Perplexity 3.38	3785
Language Understanding	MMLU 5-shot	Accuracy 66.9	153
Common Sense Reasoning	HellaSwag 0-shot	Accuracy 81.7	38
Question Answering	ARC-E 0-shot	Accuracy 77.1	37
Common Sense Reasoning	ARC-Challenge 0-shot	Accuracy 54.8	31
Question Answering	TriviaQA 5-shot	Accuracy 77.5	30
Common Sense Reasoning	ARC-Easy (ARC-E) 0-shot	Accuracy 79.3	24
Question Answering	Natural Questions (NQ) 5-shot	Accuracy 35.8	16
Question Answering	ARC-C 0-shot	Accuracy 48	4
Sentence Completion	HellaSwag 0-shot	Accuracy 74	4

