
LoPRo: Enhancing Low-Rank Quantization via Permuted Block-Wise Rotation

About

Post-training quantization (PTQ) enables effective model compression while preserving relatively high accuracy. Current weight-only PTQ methods focus primarily on the challenging sub-3-bit regime, where approaches often suffer significant accuracy degradation and typically require fine-tuning to achieve competitive performance. In this work, we revisit the fundamental characteristics of weight quantization and analyze the challenges of quantizing the residual matrix under low-rank approximation. We propose LoPRo, a novel fine-tuning-free PTQ algorithm that enhances residual-matrix quantization by applying block-wise permutation and Walsh-Hadamard transforms to rotate columns of similar importance, while explicitly preserving the quantization accuracy of the most salient column blocks. Furthermore, we introduce a mixed-precision fast low-rank decomposition based on a rank-1 sketch (R1SVD) to further reduce quantization cost. Experiments demonstrate that LoPRo outperforms existing fine-tuning-free PTQ methods at both 2-bit and 3-bit quantization, achieving accuracy comparable to fine-tuning baselines. Specifically, LoPRo achieves state-of-the-art quantization accuracy on LLaMA-2 and LLaMA-3 series models while delivering up to a 4× speedup. On the MoE model Mixtral-8x7B, LoPRo completes quantization within 2.5 hours while reducing perplexity by 0.4 and improving accuracy by 8%. Moreover, compared to other low-rank quantization methods, LoPRo achieves superior accuracy at a significantly lower rank, while maintaining high inference efficiency and minimal additional latency.
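The pipeline sketched in the abstract — split the weight into a low-rank part plus a residual, group residual columns of similar importance into blocks, rotate each block with a Walsh-Hadamard transform, then quantize — can be illustrated as below. This is a minimal NumPy sketch under stated assumptions, not the authors' implementation: it uses a plain truncated SVD in place of the paper's R1SVD, a simple column-norm proxy for importance, a uniform min-max quantizer, and hypothetical function names.

```python
import numpy as np

def hadamard(n):
    # Orthonormal Walsh-Hadamard matrix of size n (n must be a power of two).
    H = np.array([[1.0]])
    while H.shape[0] < n:
        H = np.block([[H, H], [H, -H]])
    return H / np.sqrt(H.shape[0])

def quantize(x, bits=2):
    # Illustrative uniform min-max quantizer (the paper's scheme may differ).
    lo, hi = x.min(), x.max()
    levels = 2 ** bits - 1
    scale = (hi - lo) / levels if hi > lo else 1.0
    return np.round((x - lo) / scale) * scale + lo

def lopro_sketch(W, rank=8, block=4, bits=2):
    # 1) Low-rank component; the paper uses a faster rank-1-sketch SVD (R1SVD).
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    L = (U[:, :rank] * s[:rank]) @ Vt[:rank]
    R = W - L  # residual matrix to quantize

    # 2) Permute columns so columns of similar importance share a block
    #    (column norm as an importance proxy; an assumption of this sketch).
    order = np.argsort(np.linalg.norm(R, axis=0))
    H = hadamard(block)
    Rq = np.empty_like(R)
    for i in range(0, R.shape[1], block):
        idx = order[i:i + block]
        # 3) Rotate the block, quantize, rotate back.
        Rq[:, idx] = quantize(R[:, idx] @ H, bits) @ H.T
    return L + Rq  # dequantized approximation of W
```

The Hadamard rotation spreads large-magnitude entries across a block so that a uniform quantizer sees a flatter distribution; keeping the low-rank part `L` in higher precision is what makes aggressive 2-bit residual quantization tolerable.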

Hongyaoxing Gu, Lijuan Hu, Liye Yu, Haowei Li, Fangfang Liu • 2026

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Language Modeling | WikiText2 | Perplexity 4.15 | 1875 |
| Zero-shot Question Answering and Reasoning | Accuracy Tasks Zero-shot (AC, AE, WI, QA) | AC Score 61 | 52 |
| Large Language Model Evaluation | Open LLM Leaderboard v1 (test) | Average Score 66.1 | 14 |
| Language Modeling | LLaMA-2 Family Evaluation v2 (test) | PPL 4.8 | 10 |
| Zero-shot Classification | Zero-shot Evaluation Suite (AC, AE, WI, QA) v1 | AC Score 46.2 | 10 |
