
GPTAQ: Efficient Finetuning-Free Quantization for Asymmetric Calibration

About

We introduce GPTAQ, a novel finetuning-free quantization method for compressing large-scale transformer architectures. Unlike the previous GPTQ method, which calibrates each layer independently, we always match the quantized layer's output to the exact output in the full-precision model, a scheme we call asymmetric calibration. This scheme effectively reduces the quantization error accumulated in previous layers. We analyze the problem using optimal brain compression to derive a closed-form solution. The new solution explicitly minimizes both the quantization error and the accumulated asymmetry error. Furthermore, we use several techniques to parallelize the computation of the solution, including channel parallelization, neuron decomposition, and Cholesky reformulation for matrix fusion. As a result, GPTAQ is easy to implement, requiring only 20 more lines of code than GPTQ while improving its performance under low-bit quantization. Remarkably, on a single GPU, we quantize a 405B language transformer as well as EVA-02, the top-ranked vision transformer, which achieves 90% pretraining ImageNet accuracy. Code is available on GitHub.
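The difference between per-layer (symmetric) calibration and asymmetric calibration can be illustrated with a small NumPy sketch. This is not the GPTAQ algorithm or its closed-form solution. It only contrasts the two objectives on a toy linear layer. The names `symmetric_loss`, `asymmetric_loss`, `X`, and `X_tilde` are hypothetical, the upstream quantization error is simulated with Gaussian noise, and the weights are fitted by unconstrained least squares rather than on a quantization grid.

```python
import numpy as np

def symmetric_loss(W, W_hat, X_tilde):
    """Per-layer objective: both outputs are computed on the same input."""
    return float(np.linalg.norm(W @ X_tilde - W_hat @ X_tilde) ** 2)

def asymmetric_loss(W, W_hat, X, X_tilde):
    """Asymmetric objective: full-precision output on X versus the
    compressed layer's output on the perturbed input X_tilde."""
    return float(np.linalg.norm(W @ X - W_hat @ X_tilde) ** 2)

rng = np.random.default_rng(0)
W = rng.normal(size=(8, 16))        # full-precision weights, shape (out, in)
X = rng.normal(size=(16, 256))      # inputs from the full-precision model
# Assumption: error from earlier quantized layers is simulated as additive noise.
X_tilde = X + 0.05 * rng.normal(size=X.shape)

# Unconstrained least-squares fit of W_ls @ X_tilde to W @ X.
# (No quantization grid is applied here; this is only an illustration.)
W_ls = np.linalg.lstsq(X_tilde.T, (W @ X).T, rcond=None)[0].T

loss_keep = asymmetric_loss(W, W, X, X_tilde)    # weights left unchanged
loss_ls = asymmetric_loss(W, W_ls, X, X_tilde)   # weights fitted to the FP target

print("symmetric loss with unchanged weights:", symmetric_loss(W, W, X_tilde))
print("asymmetric loss with unchanged weights:", loss_keep)
print("asymmetric loss after least-squares fit:", loss_ls)
```

In this toy setting the symmetric objective is zero for unchanged weights even though the layer's output already deviates from the full-precision output, whereas the asymmetric objective exposes that accumulated deviation and lets the weights compensate for it.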

Yuhang Li, Ruokai Yin, Donghyun Lee, Shiting Xiao, Priyadarshini Panda • 2025

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Language Modeling | WikiText2 | Perplexity 3.47 | 1875 |
| Language Modeling | WikiText-2 (test) | PPL 5.01 | 1541 |
| Language Modeling | C4 | Perplexity 5.62 | 1182 |
| Zero-shot Evaluation | Downstream Tasks Zero-shot | Accuracy 72.7 | 278 |
| Language Modeling | C4 (test) | Perplexity 10.97 | 268 |
| Zero-shot Evaluation | Eight datasets average | Accuracy 60.92 | 87 |
| Language Modeling | C4 | C4 Loss 6.57 | 73 |
| Zero-shot performance evaluation | LM Eval Harness (HellaSwag, BoolQ, WinoGrande, PiQA, ARC-easy, ARC-challenge) zero-shot | Mean Accuracy 73.23 | 60 |
| Zero-shot Reasoning | ARC-e, Winogrande, HellaSwag, PIQA | Normalized Avg Accuracy 47.3 | 36 |
| Zero-shot Evaluation | Evaluation Tasks Zero-shot Aggregate | Avg. Accuracy 71 | 33 |
