Astro: Activation-guided Structured Regularization for Outlier-Robust LLM Post-Training Quantization

About

Weight-only post-training quantization (PTQ) is crucial for efficient Large Language Model (LLM) deployment, but it suffers from accuracy degradation caused by weight and activation outliers. Existing mitigation strategies often face critical limitations: they either yield insufficient outlier suppression or incur significant deployment inefficiencies, such as added inference latency, heavy preprocessing, or reliance on complex operator fusion. To resolve these limitations, we leverage a key insight: over-parameterized LLMs often converge to flat minima, implying a vast equivalent solution space in which weights can be adjusted without compromising accuracy. Building on this, we propose Astro, an Activation-guided Structured Regularization framework designed to suppress the negative effects of outliers in a hardware-friendly and efficient manner. Through its activation-guided regularization objective, Astro actively reconstructs intrinsically robust weights, aggressively suppressing weight outliers corresponding to high-magnitude activations without sacrificing model accuracy. Crucially, Astro introduces zero inference latency and is orthogonal to mainstream quantization methods such as GPTQ. Extensive experiments show that Astro achieves highly competitive performance; notably, on LLaMA-2-7B it outperforms complex learning-based rotation methods in roughly one-third of the quantization time.
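The abstract's core idea, reconstructing weights under a regularizer that is guided by activation magnitudes so that weight outliers on high-activation channels are pulled toward a quantization-friendly range while a fidelity term preserves the layer's output, can be illustrated with a toy sketch. The objective below (a Gram-matrix fidelity term plus an activation-weighted penalty on weights exceeding a clip threshold `tau`), the function name, and all parameter choices are illustrative assumptions, not the paper's actual formulation:

```python
import numpy as np

def activation_guided_regularize(W, X, tau=1.0, lam=4.0, steps=800):
    """Hypothetical sketch of activation-guided regularization (not Astro's
    published objective). Minimizes, by gradient descent over Wn,

        (1/n) * ||X Wn^T - X W^T||_F^2                  (output fidelity)
      + lam * sum_ij a_j^2 * relu(|Wn_ij| - tau)^2      (outlier penalty)

    where a_j is the mean |activation| of input channel j, so the penalty
    bites hardest on weight outliers fed by high-magnitude activations.
    """
    act = np.abs(X).mean(axis=0)          # per-channel activation scale a_j
    G = X.T @ X / len(X)                  # normalized Gram matrix of activations
    # Step size from a Lipschitz bound on the gradient of the objective.
    lr = 0.9 / (2 * np.linalg.norm(G, 2) + 2 * lam * (act ** 2).max())
    Wn = W.copy()
    for _ in range(steps):
        excess = np.maximum(np.abs(Wn) - tau, 0.0)   # amount above clip range
        grad = 2 * (Wn - W) @ G + 2 * lam * (act ** 2) * np.sign(Wn) * excess
        Wn -= lr * grad
    return Wn

# Toy demo: channel 0 carries high-magnitude activations and a weight outlier.
rng = np.random.default_rng(0)
X = rng.normal(size=(2048, 6))
X[:, 0] *= 4.0                                    # activation outlier channel
W = np.array([[5.0, 0.4, -0.3, 0.2, 0.1, -0.2],   # row 0: weight outlier at [0, 0]
              [0.3, -0.5, 0.2, 0.4, -0.1, 0.3]])  # row 1: no outliers
Wn = activation_guided_regularize(W, X)
```

In this toy run the outlier weight is pulled far toward the clip range (shrinking the dynamic range a uniform quantization grid must cover), while rows with no outliers are left untouched because both the penalty and the fidelity gradient vanish for them.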

Xi Chen, Ming Li, Junxi Li, Changsheng Li, Peisong Wang, Lizhong Ding, Ye Yuan, Guoren Wang • 2026

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Language Modeling | WikiText2 | Perplexity | 3.38 | 1875 |
| Language Understanding | MMLU 5-shot | Accuracy | 66.9 | 132 |
| Question Answering | TriviaQA 5-shot | Accuracy | 77.5 | 30 |
| Question Answering | ARC-E 0-shot | Accuracy | 77.1 | 29 |
| Common Sense Reasoning | HellaSwag 0-shot | Accuracy | 81.7 | 22 |
| Common Sense Reasoning | ARC-Challenge 0-shot | Accuracy | 54.8 | 19 |
| Question Answering | Natural Questions (NQ) 5-shot | Accuracy | 35.8 | 16 |
| Common Sense Reasoning | ARC-Easy (ARC-E) 0-shot | Accuracy | 79.3 | 12 |
| Question Answering | ARC-C 0-shot | Accuracy | 48.0 | 4 |
| Sentence Completion | HellaSwag 0-shot | Accuracy | 74.0 | 4 |
