Learning a Zeroth-Order Optimizer for Fine-Tuning LLMs

About

Zeroth-order optimizers have recently emerged as an attractive approach for fine-tuning large language models (LLMs): they avoid backpropagation and can substantially reduce memory overhead relative to standard first-order training. However, existing zeroth-order methods rely on hand-crafted, static sampling strategies that cannot adapt to model-specific structure. To address this, we propose ZO-Finetuner, a learning-based zeroth-order optimizer for LLMs that automatically learns efficient perturbation strategies through a compact and memory-efficient design. Our approach is motivated by the observation that only a small set of base LLMs is repeatedly fine-tuned across tasks, so learning the optimizer once for a given LLM and reusing it across diverse downstream tasks is both feasible and highly desirable. Accordingly, ZO-Finetuner is designed to scale learning to learn (L2L) to the foundation-model era by supporting one-time per-model training with minimal overhead. Experiments on 4 LLMs and 7 datasets show that ZO-Finetuner outperforms prior zeroth-order baselines in 82.1% of task-model combinations, demonstrating strong performance and scalability for efficient LLM fine-tuning. The code is available at https://github.com/ASTRAL-Group/ZO_Fine_tuner.
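
To make the "no backpropagation" point concrete, here is a minimal sketch of a generic two-point (SPSA-style) zeroth-order gradient estimate on a toy quadratic loss. It is not the ZO-Finetuner implementation. The function names and the `scales` argument are hypothetical: `scales` stands in for a per-coordinate perturbation scale that a static method would fix to ones and that a learned optimizer might predict instead.

```python
import random


def zo_gradient_estimate(loss_fn, params, eps=1e-3, scales=None, rng=None):
    """Two-point zeroth-order gradient estimate from two forward passes.

    `scales` is a hypothetical per-coordinate perturbation scale
    (all ones reproduces a static, isotropic Gaussian perturbation).
    """
    rng = rng or random.Random(0)
    if scales is None:
        scales = [1.0] * len(params)
    # Sample a random perturbation direction z.
    z = [s * rng.gauss(0.0, 1.0) for s in scales]
    plus = [p + eps * zi for p, zi in zip(params, z)]
    minus = [p - eps * zi for p, zi in zip(params, z)]
    # Central finite difference approximates the directional derivative along z.
    directional = (loss_fn(plus) - loss_fn(minus)) / (2.0 * eps)
    # Project the scalar back onto z to obtain a gradient estimate.
    return [directional * zi for zi in z]


def zo_sgd(loss_fn, params, lr=0.05, steps=500, **kwargs):
    """Plain SGD driven by the zeroth-order estimate above."""
    rng = random.Random(0)
    params = list(params)
    for _ in range(steps):
        grad = zo_gradient_estimate(loss_fn, params, rng=rng, **kwargs)
        params = [p - lr * g for p, g in zip(params, grad)]
    return params


def quadratic(w):
    """Toy loss with its minimum at w = (1, ..., 1)."""
    return sum((wi - 1.0) ** 2 for wi in w)


if __name__ == "__main__":
    start = [0.0, 0.0, 0.0, 0.0]
    final = zo_sgd(quadratic, start)
    print("initial loss:", quadratic(start))
    print("final loss:", quadratic(final))
```

Each step needs only two evaluations of the loss and stores no activations for a backward pass, which is where the memory saving over first-order training comes from. The abstract's claim concerns how the perturbation is sampled; in this sketch that choice is confined to the line that draws `z`.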

Kairun Zhang, Haoyu Li, Yanjun Zhao, Yifan Sun, Huan Zhang • 2025

Related benchmarks

Task	Dataset	Metric	Result
Common Sense Reasoning	COPA	Accuracy	92	288
Question Answering	BoolQ	Accuracy	66	233
Reading Comprehension	DROP	DROP Accuracy	32	138
Natural Language Inference	CB	Accuracy	73	129
Coreference Resolution	WSC	Accuracy	57	116
Sentiment Analysis	SST-2	Top-1 Accuracy (SST-2)	94	51
Mathematical Reasoning	MATH 500	Accuracy (MATH-500)	54.6	33
Sentiment Analysis	SST-2	Accuracy	94	26
Question Answering	SQuAD	Loss	0.22	20
Reading Comprehension	DROP	Loss	0.4	20

Showing 10 of 14 rows
