Compander-Aligned Query Geometry for Quantized Zeroth-Order Optimization

About

Low-bit forward evaluation is an attractive route to memory-efficient zeroth-order (ZO) adaptation: the optimizer needs only scalar losses, and the model can be queried near deployment precision. The obstacle is that a quantized ZO query is not a continuous finite difference followed by harmless storage rounding. The query chooses endpoints, the low-precision engine rounds them, and the loss difference is measured along the rounded chord. For nonuniform companding quantizers, this makes the codebook insufficient to predict ZO behavior: a fixed weight-space radius can collapse in dense cells, over-span sparse cells, or assign a rounded chord to an unrounded update direction. We identify the missing object as query geometry and model scalar nonuniform quantization as $Q = \phi^{-1} \circ U \circ \phi$. CAQ-ZO (Compander-Aligned Queries for Zeroth-Order Optimization) forms one-grid-step Rademacher stencils $z \pm \Delta r$ in $z = \phi(x)$, maps endpoints back through $\phi^{-1}$, and updates in $z$. Our theory proves the grid-span mismatch, decomposes endpoint-rounding estimator residuals, and gives stationarity bounds in which generic off-grid queries retain a $\Delta^2/\mu^2$ residual channel while CAQ-ZO makes the query-time residual exactly zero. Synthetic experiments isolate this channel, and matched NF4 Qwen/Llama fine-tuning shows that CAQ-ZO improves the trained NF4 baseline under the same quantizer and evaluation budget.
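The construction in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the compander `phi = arcsinh`, the step `DELTA`, the toy loss, and all function names are hypothetical choices made for illustration, and the two-point estimator is the standard symmetric finite-difference form, assumed here rather than taken from the paper. The sketch shows that when $z$ lies on the uniform grid, the endpoints $\phi^{-1}(z \pm \Delta r)$ are fixed points of $Q = \phi^{-1} \circ U \circ \phi$, so rounding does not move the query.

```python
import numpy as np

DELTA = 0.25  # assumed grid step in z-space; a power of two keeps grid arithmetic exact


def phi(x):
    """Hypothetical compander (stand-in for the quantizer's own phi)."""
    return np.arcsinh(x)


def phi_inv(z):
    """Inverse of the hypothetical compander."""
    return np.sinh(z)


def uniform_round(z, delta=DELTA):
    """U: round to the nearest point of the uniform grid delta * Z."""
    return delta * np.round(z / delta)


def quantize(x, delta=DELTA):
    """Q = phi^{-1} o U o phi, the scalar nonuniform quantizer model."""
    return phi_inv(uniform_round(phi(x), delta))


def caq_stencil(z, rng, delta=DELTA):
    """One-grid-step Rademacher stencil z +/- delta * r, mapped back to weight space."""
    r = rng.choice(np.array([-1.0, 1.0]), size=z.shape)
    return phi_inv(z + delta * r), phi_inv(z - delta * r), r


def caq_gradient_estimate(loss, z, rng, delta=DELTA):
    """Symmetric two-point estimate in z coordinates (assumed form, not from the paper)."""
    x_plus, x_minus, r = caq_stencil(z, rng, delta)
    # The engine rounds the endpoints; for on-grid z this rounding is a no-op.
    diff = loss(quantize(x_plus, delta)) - loss(quantize(x_minus, delta))
    return diff / (2.0 * delta) * r


rng = np.random.default_rng(0)
z = uniform_round(rng.normal(size=8))  # start from an on-grid point in z-space
x_plus, x_minus, r = caq_stencil(z, rng)
g = caq_gradient_estimate(lambda x: float(np.sum(x ** 2)), z, rng)

# A fixed weight-space radius, by contrast, can round both endpoints to the same code.
x0 = phi_inv(np.array([3.0]))
collapsed = np.array_equal(quantize(x0 + 0.01), quantize(x0 - 0.01))
```

In this toy setting `quantize(x_plus)` reproduces `x_plus`, while the fixed-radius pair around `x0` collapses onto a single code and yields a zero loss difference. How the updated $z$ is returned to the grid after a step is not specified in the abstract and is left out of the sketch.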

Yao Shu, Zilin Zhu • 2026

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Text Classification | BoolQ | Accuracy 67.2 | 118 |
| Text Classification | RTE | Accuracy 61.5 | 104 |
| Classification | SST2 | Accuracy 76.5 | 102 |
| Classification | CB | Accuracy 60.5 | 70 |
| Generation | SQuAD | F1 Score 58.6 | 52 |
| Multiple-Choice | COPA | Accuracy 78.4 | 36 |