BLoB: Bayesian Low-Rank Adaptation by Backpropagation for Large Language Models
About
Large Language Models (LLMs) often suffer from overconfidence during inference, particularly when adapted to downstream domain-specific tasks with limited data. Previous work addresses this issue by applying approximate Bayesian estimation after the LLMs are trained, enabling them to quantify uncertainty. However, the performance of such post-training approaches is severely limited by the parameters learned during training. In this paper, we go beyond post-training Bayesianization and propose Bayesian Low-Rank Adaptation by Backpropagation (BLoB), an algorithm that continuously and jointly adjusts both the mean and covariance of LLM parameters throughout the whole fine-tuning process. Our empirical results verify the effectiveness of BLoB in terms of generalization and uncertainty estimation, when evaluated on both in-distribution and out-of-distribution data.
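The core idea above, learning a mean and covariance for the low-rank adapter weights via backpropagation, can be illustrated with a minimal NumPy sketch. This is an illustrative assumption-laden toy, not the paper's exact parameterization: it places a diagonal Gaussian on the LoRA `A` factor, samples it with the reparameterization trick, and averages predictions over posterior samples at inference time. All names (`A_mean`, `A_logstd`, `sample_weight`) and dimensions are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

d, k, r = 16, 16, 4                        # hypothetical layer width and low rank
W0 = rng.standard_normal((d, k)) * 0.02    # frozen pretrained weight

# Variational parameters of the low-rank update (illustrative only):
# A is Gaussian with a learnable mean and per-element log-std (a diagonal
# covariance); B is kept as a point estimate here for simplicity.
A_mean = rng.standard_normal((r, k)) * 0.02
A_logstd = np.full((r, k), -3.0)           # small initial posterior std
B = rng.standard_normal((d, r)) * 0.02

def sample_weight(rng):
    """Reparameterized sample: W = W0 + B @ (A_mean + exp(A_logstd) * eps).

    Because the noise eps is external, gradients w.r.t. A_mean and
    A_logstd flow through ordinary backpropagation.
    """
    eps = rng.standard_normal(A_mean.shape)
    A = A_mean + np.exp(A_logstd) * eps
    return W0 + B @ A

x = rng.standard_normal(k)
# At inference, average predictions over several posterior samples to
# obtain an uncertainty-aware output.
ys = np.stack([sample_weight(rng) @ x for _ in range(8)])
y_mean = ys.mean(axis=0)
```

In training, a KL term between the variational posterior and a prior would be added to the task loss; here only the sampling path is shown.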
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Commonsense Reasoning | ARC Challenge | Accuracy | 68.81 | 190 |
| Commonsense Reasoning | ARC-C | Accuracy | 79.42 | 172 |
| Commonsense Reasoning | OBQA | Accuracy | 82.73 | 117 |
| Commonsense Reasoning | ARC-E | Accuracy | 90.16 | 106 |
| Paraphrase Detection | MRPC GLUE (val) | Accuracy | 0.8873 | 27 |
| Natural Language Inference | RTE (val) | Accuracy | 0.7605 | 24 |
| Commonsense Reasoning | WG-S | Accuracy | 70.89 | 18 |
| Commonsense Reasoning | BoolQ | Accuracy | 86.99 | 18 |
| Commonsense Reasoning | WG-M | Accuracy | 74.55 | 18 |
| Commonsense Reasoning | WinoGrande S In-Distribution Llama-3.1-8B (train/test) | Accuracy | 72.36 | 15 |