Learning a Zeroth-Order Optimizer for Fine-Tuning LLMs

About

Zeroth-order optimizers have recently emerged as an attractive approach for fine-tuning large language models (LLMs), as they avoid backpropagation and can substantially reduce memory overhead relative to standard first-order training. However, existing zeroth-order methods rely on hand-crafted, static sampling strategies that cannot adapt to model-specific structures. To address this, we propose ZO-Finetuner, a learning-based zeroth-order optimizer for LLMs that automatically learns efficient perturbation strategies through a compact and memory-efficient design. Because a small set of base LLMs is repeatedly fine-tuned across many tasks, learning the optimizer once for a given LLM and reusing it across diverse downstream tasks is both feasible and highly desirable. Accordingly, ZO-Finetuner is designed to scale learning to learn (L2L) to the foundation-model era by supporting one-time per-model training with minimal overhead. Experiments on 4 LLMs and 7 datasets show that ZO-Finetuner outperforms prior zeroth-order baselines in 82.1% of task-model combinations, demonstrating strong performance and scalability for efficient LLM fine-tuning. The code is available at https://github.com/ASTRAL-Group/ZO_Fine_tuner.
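
To make the "no backpropagation" point concrete, here is a minimal sketch of a generic two-point (SPSA-style) zeroth-order update on a toy quadratic loss. It is only an illustration of the hand-crafted, static Gaussian sampling that the abstract contrasts against. It is not ZO-Finetuner's learned perturbation strategy, and every name in it (`zo_gradient_estimate`, `zo_sgd`, `toy_loss`, the step size and perturbation scale) is a hypothetical choice made for this example.

```python
import numpy as np


def zo_gradient_estimate(loss_fn, params, eps=1e-3, rng=None):
    """Two-point zeroth-order gradient estimate (illustrative, not ZO-Finetuner).

    Only two forward evaluations of loss_fn are needed; no backward pass.
    """
    rng = np.random.default_rng() if rng is None else rng
    # Static, hand-crafted sampling: an isotropic Gaussian direction.
    z = rng.standard_normal(params.shape)
    loss_plus = loss_fn(params + eps * z)
    loss_minus = loss_fn(params - eps * z)
    # Finite-difference estimate of the directional derivative along z.
    projected_grad = (loss_plus - loss_minus) / (2.0 * eps)
    return projected_grad * z


def zo_sgd(loss_fn, params, lr=0.05, steps=500, eps=1e-3, seed=0):
    """Plain SGD driven by the zeroth-order estimate above."""
    rng = np.random.default_rng(seed)
    params = params.copy()
    for _ in range(steps):
        params -= lr * zo_gradient_estimate(loss_fn, params, eps, rng)
    return params


def toy_loss(w):
    """Hypothetical stand-in for a model loss: squared distance to a target."""
    target = np.array([1.0, -2.0, 0.5])
    return float(np.sum((w - target) ** 2))


w0 = np.zeros(3)
w_final = zo_sgd(toy_loss, w0)
```

In this sketch the perturbation distribution is fixed in advance. A learned optimizer in the sense described above would instead adapt how perturbations are sampled for a given base model.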

Kairun Zhang, Haoyu Li, Yanjun Zhao, Yifan Sun, Huan Zhang • 2025

Related benchmarks

Task | Dataset | Metric | Result | Rank
Common Sense Reasoning | COPA | Accuracy | 92 | 256
Question Answering | BoolQ | Accuracy | 66 | 201
Reading Comprehension | DROP | DROP Accuracy | 32 | 129
Natural Language Inference | CB | Accuracy | 73 | 129
Coreference Resolution | WSC | Accuracy | 57 | 116
Sentiment Analysis | SST-2 | Top-1 Accuracy (SST-2) | 94 | 29
Question Answering | SQuAD | Loss | 0.22 | 20
Reading Comprehension | DROP | Loss | 0.4 | 20
Sentiment Analysis | SST-2 | Loss | 0.14 | 20
Coreference Resolution | WSC | Loss | 0.02 | 20
Showing 10 of 14 rows
