BEFT: Bias-Efficient Fine-Tuning of Language Models in Low-Data Regimes

About

Fine-tuning the bias terms of large language models (LLMs) has the potential to achieve unprecedented parameter efficiency while maintaining competitive performance, particularly in low-data regimes. However, the link between fine-tuning different bias terms (i.e., $\boldsymbol{b}_q$, $\boldsymbol{b}_k$, and $\boldsymbol{b}_v$ in the query, key, and value projections, respectively) and downstream performance remains largely unclear. In this paper, we investigate how fine-tuning each of $\boldsymbol{b}_q$, $\boldsymbol{b}_k$, and $\boldsymbol{b}_v$ relates to downstream task performance. Our key finding is that directly fine-tuning $\boldsymbol{b}_v$ generally leads to higher downstream performance in low-data regimes than fine-tuning $\boldsymbol{b}_q$ or $\boldsymbol{b}_k$. We extensively evaluate this property across a wide range of LLMs, spanning encoder-only and decoder-only architectures with up to 6.7B parameters (including bias-free LLMs). Our results provide strong evidence for the effectiveness of directly fine-tuning $\boldsymbol{b}_v$ across various downstream tasks. The implementation code is available at https://github.com/whubaichuan/BEFT.

Baichuan Huang, Ananth Balashankar, Amir Aminifar • 2025
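
To make the idea in the abstract concrete, the following is a minimal sketch of restricting training to the value-projection bias $\boldsymbol{b}_v$ by freezing every other parameter. It is not the authors' implementation (see the linked repository for that). The helper name `freeze_all_but_value_bias` and the name suffixes in `VALUE_BIAS_SUFFIXES` are assumptions made for illustration; parameter names differ between checkpoints and should be checked against the model in use. The helper only needs `(name, param)` pairs whose `param` has a writable `requires_grad` attribute, which is the shape PyTorch's `model.named_parameters()` yields, so the demo uses stand-in objects instead of a real model.

```python
from types import SimpleNamespace

# Assumed name suffixes for the value-projection bias b_v. These are
# illustrative; inspect your own model's parameter names before use.
VALUE_BIAS_SUFFIXES = ("v_proj.bias", "value.bias")


def freeze_all_but_value_bias(named_params, suffixes=VALUE_BIAS_SUFFIXES):
    """Leave only parameters whose name ends with one of `suffixes` trainable.

    `named_params` is any iterable of (name, param) pairs where `param`
    exposes a writable `requires_grad` attribute. Returns the list of
    names that remain trainable.
    """
    suffixes = tuple(suffixes)
    trainable = []
    for name, param in named_params:
        # True only for the value-projection bias; everything else is frozen.
        param.requires_grad = name.endswith(suffixes)
        if param.requires_grad:
            trainable.append(name)
    return trainable


# Stand-in parameters (hypothetical names) so the sketch runs without a
# deep-learning framework installed.
toy_params = [
    ("layers.0.attn.q_proj.weight", SimpleNamespace(requires_grad=True)),
    ("layers.0.attn.q_proj.bias", SimpleNamespace(requires_grad=True)),
    ("layers.0.attn.k_proj.bias", SimpleNamespace(requires_grad=True)),
    ("layers.0.attn.v_proj.weight", SimpleNamespace(requires_grad=True)),
    ("layers.0.attn.v_proj.bias", SimpleNamespace(requires_grad=True)),
]

trainable_names = freeze_all_but_value_bias(toy_params)
print(trainable_names)
```

With a real PyTorch model, the same call would be `freeze_all_but_value_bias(model.named_parameters())`, after which only the parameters still marking `requires_grad` as true would be passed to the optimizer. A bias-free model has no such parameters to select, so it would need a bias term added first; that step is outside this sketch.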

Related benchmarks

Task	Dataset	Result
Generation	DROP	F1 Score 32.4	49
Classification	GLUE	SST-2 Accuracy 95.2	14
Classification	SuperGLUE	CB Accuracy 96.4	14
Multiple-Choice	SuperGLUE	COPA Score 83	14
Natural Language Inference	RTE low-data regime GLUE	Accuracy 58.53	4