Nonlinear Bipolar Compensation: Handling Outliers in Post-Training Quantization
About
Network quantization has emerged as one of the most practical model compression techniques: it significantly reduces a model's memory and compute consumption by mapping floating-point numbers to low-bit representations. However, existing quantization methods typically suffer from a speed-accuracy tradeoff and limited generalization. To address these issues, recent compensation-based methods offer an efficient yet general solution by introducing additional lightweight linear layers into the quantized network. Their accuracy, however, is held back by limited compensation capability and high sensitivity to outliers. In this paper, we propose Nonlinear Bipolar Compensation (NBC), a post-training quantization approach that introduces nonlinear compensation to reduce the effect of outliers. We further design the Bipolar Logarithmic Transformation (BLT), which compresses outliers by mapping both the quantized input and the quantization error into a transformed space. A simple linear layer is then applied for compensation in the transformed space, preserving the efficiency of our method. Extensive experiments across various tasks, models, and quantization methods confirm the effectiveness, efficiency, robustness, and generality of our NBC approach.
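The pipeline the abstract describes (quantize, map the quantized signal and its error into a log-compressed space, fit a linear layer there) can be illustrated with a minimal NumPy sketch. Everything in it is an assumption for illustration rather than the authors' implementation: the abstract does not give the exact form of BLT, so a signed logarithm `sign(x) * log(1 + |x|)` stands in for it, the quantizer is a plain symmetric per-tensor scheme, the compensation layer is fit by ordinary least squares, and all function names are hypothetical.

```python
import numpy as np


def quantize_uniform(x, num_bits=4):
    """Symmetric per-tensor uniform quantization, returned in dequantized form."""
    qmax = 2 ** (num_bits - 1) - 1
    scale = np.max(np.abs(x)) / qmax
    if scale == 0:
        return x.copy()
    q = np.clip(np.round(x / scale), -qmax - 1, qmax)
    return q * scale


def signed_log(x):
    """Assumed stand-in for BLT: sign(x) * log(1 + |x|).

    Keeps the sign of each value while compressing large magnitudes.
    """
    return np.sign(x) * np.log1p(np.abs(x))


def signed_log_inverse(y):
    """Inverse of signed_log: sign(y) * (exp(|y|) - 1)."""
    return np.sign(y) * np.expm1(np.abs(y))


def fit_compensation(y_q, err):
    """Fit a linear map in the transformed space by least squares.

    y_q: quantized-layer outputs, shape (n_samples, n_features)
    err: full-precision output minus y_q, same shape
    """
    a = signed_log(y_q)
    b = signed_log(err)
    w, *_ = np.linalg.lstsq(a, b, rcond=None)
    return w


def compensate(y_q, w):
    """Predict the error in the transformed space, map it back, and add it."""
    predicted_err = signed_log_inverse(signed_log(y_q) @ w)
    return y_q + predicted_err
```

A typical use would collect `y_q` and `err` from a small calibration set, call `fit_compensation` once, and then apply `compensate` at inference time. Because the fit minimizes error in the transformed space, this sketch makes no claim about how much error is recovered in the original space.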
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Image Classification | ImageNet-1k (val) | Top-1 Accuracy | 83.9 | 1498 |
| Language Generation | WikiText2 | Perplexity | 6.03 | 287 |
| Language Generation | C4 | Perplexity | 7.69 | 190 |
| Image Generation | ImageNet | FID | 5.41 | 101 |
| Object Detection | COCO | AP^b | 51.7 | 49 |
| Zero-shot Question Answering and Commonsense Reasoning | Zero-shot Downstream Tasks (ARC, HellaSwag, WinoGrande, BoolQ, PiQA) | Average Accuracy (Zero-Shot) | 61.47 | 48 |
| Instance Segmentation | COCO | mAP^mask 50-95 | 44.9 | 43 |
| Zero-shot Classification | ImageNet | Top-1 Clean Accuracy | 0.631 | 20 |
| Image Classification | ImageNet 32 (val) | Top-1 Accuracy | 81.2 | 12 |