
Training-Free Bayesianization for Low-Rank Adapters of Large Language Models

About

Estimating the uncertainty of responses from Large Language Models (LLMs) remains a critical challenge. While recent Bayesian methods have demonstrated effectiveness in quantifying uncertainty through low-rank weight updates, they typically require complex fine-tuning or post-training procedures. In this paper, we propose Training-Free Bayesianization (TFB), a simple yet theoretically grounded framework that efficiently transforms trained low-rank adapters into Bayesian ones without additional training. TFB systematically searches for the maximally acceptable level of variance in the weight posterior, constrained within a family of low-rank isotropic Gaussian distributions. Our theoretical analysis shows that under mild conditions, this search process is equivalent to KL-regularized variational optimization, a generalized form of variational inference. Through comprehensive experiments, we show that TFB achieves superior uncertainty estimation and generalization compared to existing methods while eliminating the need for complex Bayesianization training procedures. Code will be available at https://github.com/Wang-ML-Lab/bayesian-peft.
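The search for a "maximally acceptable level of variance" can be illustrated with a small toy sketch. The abstract does not specify the search procedure, the acceptance criterion, or where the noise is injected, so everything below is an assumption made for illustration: a bisection over a single standard deviation `sigma`, a hypothetical `degradation` callback that reports how much performance drops at a given `sigma`, and a hypothetical tolerance on that drop. It is not the authors' implementation; see their repository for that.

```python
import random


def search_max_sigma(degradation, tolerance, sigma_hi=1.0, iters=50):
    """Bisection for the largest sigma with degradation(sigma) <= tolerance.

    Assumption (not from the paper): degradation is non-decreasing in sigma,
    so the acceptable sigmas form an interval [0, sigma_max].
    """
    lo, hi = 0.0, sigma_hi
    if degradation(hi) <= tolerance:
        return hi  # even the upper bound is acceptable
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if degradation(mid) <= tolerance:
            lo = mid  # mid is acceptable, try a larger variance
        else:
            hi = mid  # mid degrades too much, shrink the range
    return lo


def sample_weights(mean, sigma, rng):
    """Draw one sample from an isotropic Gaussian N(mean, sigma^2 I)."""
    return [m + sigma * rng.gauss(0.0, 1.0) for m in mean]


# Toy stand-in for a trained adapter: a quadratic loss 0.5 * ||w - w_star||^2
# with the trained weights already at the optimum. Under isotropic noise the
# expected loss increase is 0.5 * DIM * sigma^2 (hypothetical example).
DIM = 8


def expected_degradation(sigma):
    return 0.5 * DIM * sigma ** 2


sigma_star = search_max_sigma(expected_degradation, tolerance=0.01)
noisy_weights = sample_weights([0.0] * DIM, sigma_star, random.Random(0))
```

In this toy setting the search converges to a `sigma_star` of about 0.05, the point where the expected loss increase reaches the tolerance. With a real adapter, the `degradation` callback would have to be replaced by an evaluation on held-out or anchor data, which is typically noisy and may not be exactly monotone.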

Haizhou Shi, Yibin Wang, Ligong Han, Huan Zhang, Hao Wang • 2024

Related benchmarks

Task | Dataset | Metric | Result | Rank
Commonsense Reasoning | ARC-C | Accuracy | 80.58 | 172
Commonsense Reasoning | ARC-E | Accuracy | 91.2 | 106
Common Sense Reasoning | WinoGrande S In-Distribution Llama-3.1-8B (train test) | Accuracy | 76.25 | 15
Common Sense Reasoning | ARC-C OOD Small Shift | Accuracy | 80.97 | 14
Common Sense Reasoning | Chem OOD Large Shift | Accuracy | 47.33 | 14
Uncertainty Quantification | ARC-E | Training Memory (MB) | 2.10e+4 | 14
Common Sense Reasoning | ARC-E OOD Small Shift | Accuracy | 85.74 | 12
Common Sense Reasoning | Phy OOD Large Shift | Accuracy | 48.83 | 12
Common Sense Reasoning | OBQA In-Distribution | Accuracy | 88.3 | 12
Common Sense Reasoning | WG-M (In-Distribution) | Accuracy | 82.7 | 12
Showing 10 of 12 rows
