
DAQ: Delta-Aware Quantization for Post-Training LLM Weight Compression

About

We introduce Delta-Aware Quantization (DAQ), a data-free post-training quantization framework that preserves the knowledge acquired during post-training. Standard quantization objectives minimize reconstruction error but are agnostic to the base model, allowing quantization noise to disproportionately corrupt the small-magnitude parameter deltas ($\Delta W$) that encode post-training behavior; we analyze this effect through the lens of quantization as implicit regularization. DAQ replaces reconstruction-based objectives with two delta-aware metrics, Sign Preservation Rate and Cosine Similarity, that directly optimize for the directional fidelity of $\Delta W$ and require only the base and post-trained weight matrices. In a pilot FP8 study, DAQ recovers style-specific capabilities lost under standard quantization while maintaining general performance.
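To make the objective concrete, here is a minimal sketch of how the two metrics could be computed, assuming the quantized delta is defined as $\hat{\Delta W} = Q(W_{\text{post}}) - W_{\text{base}}$. The function name `delta_metrics` and the round-to-nearest quantizer are illustrative placeholders, not the paper's FP8 procedure.

```python
# Minimal sketch of the two delta-aware metrics named in the abstract,
# assuming the quantized delta is Q(W_post) - W_base; the paper's exact
# definitions and FP8 codec may differ.
import numpy as np

def delta_metrics(w_base: np.ndarray, w_post: np.ndarray, w_quant: np.ndarray):
    """Sign Preservation Rate and Cosine Similarity of the post-training delta."""
    delta = (w_post - w_base).ravel()       # Delta W: encodes post-training behavior
    delta_hat = (w_quant - w_base).ravel()  # delta surviving quantization of w_post
    # Sign Preservation Rate: fraction of entries whose sign is unchanged.
    spr = float(np.mean(np.sign(delta) == np.sign(delta_hat)))
    # Cosine Similarity: directional fidelity of the quantized delta.
    eps = 1e-12
    cos = float(delta @ delta_hat /
                (np.linalg.norm(delta) * np.linalg.norm(delta_hat) + eps))
    return spr, cos

# Toy usage with a crude round-to-nearest quantizer standing in for FP8.
rng = np.random.default_rng(0)
w_base = rng.normal(size=(256, 256)).astype(np.float32)
w_post = w_base + 0.01 * rng.normal(size=(256, 256)).astype(np.float32)
step = 0.05
w_quant = np.round(w_post / step) * step    # placeholder quantizer, not DAQ's
print(delta_metrics(w_base, w_post, w_quant))
```

Because both metrics depend only on $W_{\text{base}}$ and $W_{\text{post}}$, no calibration data is needed, which is consistent with the framework being described as data-free.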

Xiaoming Yu, Shize Tang, Guanghua Yu, Linchuan Xie, Song Liu, Jianchen Zhu, Feng Li • 2026

Related benchmarks

Task                             Dataset                       Result   Rank
Dialogue Style Evaluation        SFT Dialogue                  --       6
General Capability Evaluation    General Capability Dataset    --       6
Weight Reconstruction Fidelity   DeepSeek-V3 Weights           --       3
