Compander-Aligned Query Geometry for Quantized Zeroth-Order Optimization

About

Low-bit forward evaluation is an attractive route to memory-efficient zeroth-order (ZO) adaptation: the optimizer needs only scalar losses, and the model can be queried near deployment precision. The obstacle is that a quantized ZO query is not a continuous finite difference followed by harmless storage rounding. The query chooses endpoints, the low-precision engine rounds them, and the loss difference is measured along the rounded chord. For nonuniform companding quantizers, this makes the codebook insufficient to predict ZO behavior: a fixed weight-space radius can collapse in dense cells, over-span sparse cells, or assign a rounded chord to an unrounded update direction. We identify the missing object as query geometry and model scalar nonuniform quantization as $Q = \phi^{-1} \circ U \circ \phi$. CAQ-ZO (Compander-Aligned Queries for Zeroth-Order Optimization) forms one-grid-step Rademacher stencils $z \pm \Delta r$ in $z = \phi(x)$, maps endpoints back through $\phi^{-1}$, and updates in $z$. Our theory proves the grid-span mismatch, decomposes endpoint-rounding estimator residuals, and gives stationarity bounds in which generic off-grid queries retain a $\Delta^2/\mu^2$ residual channel while CAQ-ZO makes the query-time residual exactly zero. Synthetic experiments isolate this channel, and matched NF4 Qwen/Llama fine-tuning shows that CAQ-ZO improves the trained NF4 baseline under the same quantizer and evaluation budget.
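The query construction described above can be illustrated with a small NumPy sketch. This is not the authors' implementation: the mu-law compander standing in for $\phi$, the grid step `DELTA`, the unbounded uniform grid, the toy quadratic loss, and all function names are assumptions chosen only to make the example self-contained. It builds $Q = \phi^{-1} \circ U \circ \phi$, forms the one-grid-step stencil $z \pm \Delta r$, and checks that the mapped-back endpoints are left unchanged by $Q$, whereas a fixed weight-space radius spans a varying number of grid steps.

```python
import numpy as np

MU = 255.0     # assumed mu-law parameter; stands in for the compander phi
DELTA = 0.125  # assumed uniform grid step in the companded domain z

def phi(x):
    """Compander: weight space x -> companded space z (mu-law, an assumption)."""
    return np.sign(x) * np.log1p(MU * np.abs(x)) / np.log1p(MU)

def phi_inv(z):
    """Expander: inverse of phi."""
    return np.sign(z) * np.expm1(np.abs(z) * np.log1p(MU)) / MU

def U(z):
    """Uniform rounding onto the grid {k * DELTA} (unbounded, for simplicity)."""
    return DELTA * np.round(z / DELTA)

def Q(x):
    """Nonuniform quantizer Q = phi^{-1} o U o phi."""
    return phi_inv(U(phi(x)))

def caq_endpoints(z, r):
    """One-grid-step stencil z +/- DELTA * r, mapped back to weight space."""
    return phi_inv(z + DELTA * r), phi_inv(z - DELTA * r)

def caq_estimate(loss, z, r):
    """Two-point ZO estimate of the gradient in z from two scalar losses."""
    x_plus, x_minus = caq_endpoints(z, r)
    return (loss(x_plus) - loss(x_minus)) / (2.0 * DELTA) * r

def rounded_span(x, radius):
    """Grid steps between the rounded endpoints of a fixed weight-space query."""
    return np.abs(U(phi(x + radius)) - U(phi(x - radius))) / DELTA

rng = np.random.default_rng(0)
z = DELTA * rng.integers(-8, 9, size=6).astype(float)  # on-grid iterate
r = rng.choice([-1.0, 1.0], size=z.shape)              # Rademacher direction
x_plus, x_minus = caq_endpoints(z, r)

# Aligned endpoints are fixed points of Q, so rounding leaves them unchanged.
caq_residual = max(np.abs(Q(x_plus) - x_plus).max(),
                   np.abs(Q(x_minus) - x_minus).max())

# A fixed weight-space radius covers a different number of grid steps
# depending on where the weight sits on the compander curve.
spans = rounded_span(phi_inv(np.array([0.0, 1.0])), radius=0.05)

loss = lambda x: 0.5 * np.sum((x - 0.3) ** 2)  # toy quadratic, hypothetical
g_z = caq_estimate(loss, z, r)
print(caq_residual, spans, g_z)
```

In this sketch the stencil lives entirely on the $z$ grid, so the loss difference is measured along the same chord the estimator assumes. The `rounded_span` check shows the contrast case: the same weight-space radius can round to several grid steps at one coordinate and to none at another.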

Yao Shu, Zilin Zhu • 2026

Related benchmarks

Task	Dataset	Result
Text Classification	BoolQ	Accuracy 67.2	124
Text Classification	RTE	Accuracy 61.5	110
Classification	SST2	Accuracy 76.5	108
Classification	CB	Accuracy 60.5	76
Generation	SQuAD	F1 Score 58.6	58
Multiple-Choice	COPA	Accuracy 78.4	42

