BLoB: Bayesian Low-Rank Adaptation by Backpropagation for Large Language Models

About

Large Language Models (LLMs) often suffer from overconfidence during inference, particularly when adapted to downstream domain-specific tasks with limited data. Previous work addresses this issue by employing approximate Bayesian estimation after the LLMs are trained, enabling them to quantify uncertainty. However, the performance of such post-training approaches is severely limited by the parameters learned during training. In this paper, we go beyond post-training Bayesianization and propose Bayesian Low-Rank Adaptation by Backpropagation (BLoB), an algorithm that continuously and jointly adjusts both the mean and covariance of LLM parameters throughout the whole fine-tuning process. Our empirical results verify the effectiveness of BLoB in terms of generalization and uncertainty estimation, when evaluated on both in-distribution and out-of-distribution data.
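To make the idea concrete, below is a minimal sketch, in PyTorch, of what jointly learning the mean and covariance of a low-rank weight update by backpropagation can look like. The layer name `VariationalLoRALinear`, the choice of a mean-field Gaussian posterior on the A factor with a deterministic B factor, and the isotropic Gaussian prior are all illustrative assumptions; they are not taken from the authors' code, and BLoB's exact parameterization may differ.

```python
import math
import torch
import torch.nn as nn

class VariationalLoRALinear(nn.Module):
    """Illustrative sketch (not the authors' code): a LoRA layer whose
    low-rank factor A has a mean-field Gaussian posterior
    q(A) = N(mu, diag(sigma^2)), with both mu and sigma trained by
    backpropagation throughout fine-tuning."""

    def __init__(self, in_features, out_features, rank=8, prior_std=1.0):
        super().__init__()
        # Frozen pretrained weight (random here, as a stand-in).
        self.weight = nn.Parameter(torch.randn(out_features, in_features) * 0.02,
                                   requires_grad=False)
        self.B = nn.Parameter(torch.zeros(out_features, rank))                # deterministic factor
        self.A_mu = nn.Parameter(torch.randn(rank, in_features) * 0.02)       # posterior mean
        self.A_logvar = nn.Parameter(torch.full((rank, in_features), -10.0))  # posterior log-variance
        self.prior_std = prior_std

    def forward(self, x):
        if self.training:
            # Reparameterization trick: A = mu + sigma * eps with eps ~ N(0, I),
            # so gradients reach both the mean and the (log-)variance.
            eps = torch.randn_like(self.A_mu)
            A = self.A_mu + torch.exp(0.5 * self.A_logvar) * eps
        else:
            # At eval time, use the mean (or sample several A's for uncertainty).
            A = self.A_mu
        return x @ (self.weight + self.B @ A).T

    def kl_divergence(self):
        # KL( N(mu, sigma^2) || N(0, prior_std^2) ), summed over all entries of A.
        var = torch.exp(self.A_logvar)
        return 0.5 * ((var + self.A_mu ** 2) / self.prior_std ** 2
                      - 1.0 - self.A_logvar + 2.0 * math.log(self.prior_std)).sum()
```

Under these assumptions, fine-tuning would minimize the task negative log-likelihood plus a (possibly down-weighted) sum of `kl_divergence()` over all adapted layers, so the posterior mean and variance are adjusted jointly at every step rather than fitted after training; at inference, averaging predictions over several sampled A's would yield the uncertainty estimates.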

Yibin Wang, Haizhou Shi, Ligong Han, Dimitris Metaxas, Hao Wang • 2024

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Commonsense Reasoning | ARC Challenge | Accuracy | 68.81 | 190 |
| Commonsense Reasoning | ARC-C | Accuracy | 79.42 | 172 |
| Commonsense Reasoning | OBQA | Accuracy | 82.73 | 117 |
| Commonsense Reasoning | ARC-E | Accuracy | 90.16 | 106 |
| Paraphrase Detection | MRPC GLUE (val) | Accuracy | 0.8873 | 27 |
| Natural Language Inference | RTE (val) | Accuracy | 0.7605 | 24 |
| Commonsense Reasoning | WG-S | Accuracy | 70.89 | 18 |
| Commonsense Reasoning | BoolQ | Accuracy | 86.99 | 18 |
| Commonsense Reasoning | WG-M | Accuracy | 74.55 | 18 |
| Commonsense Reasoning | WinoGrande S In-Distribution Llama-3.1-8B (train test) | Accuracy | 72.36 | 15 |

Showing 10 of 37 rows.
