Bayesian-LoRA: Probabilistic Low-Rank Adaptation of Large Language Models
About
Large Language Models tend to prioritize accuracy and will guess even when uncertain about a prediction; this miscalibration is especially severe when models are fine-tuned on small datasets. In this work, we introduce Bayesian-LoRA, which reformulates the deterministic LoRA update as a probabilistic low-rank representation inspired by Sparse Gaussian Processes (SGPs). We identify a structural isomorphism between LoRA's factorization and Kronecker-factored SGP posteriors, and show that LoRA emerges as a limiting case when posterior uncertainty collapses. We conduct extensive experiments on various LLM architectures across commonsense reasoning benchmarks. With only ${\approx}0.42$M additional parameters and ${\approx}1.2{\times}$ training cost relative to standard LoRA, Bayesian-LoRA significantly improves calibration across models up to 30B parameters, achieving up to 84% ECE reduction and 76% NLL reduction while maintaining competitive accuracy in both in-distribution and out-of-distribution (OoD) evaluations.
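To make the idea concrete, here is a minimal NumPy sketch of the core mechanism: the low-rank factors of a LoRA update are given a Gaussian mean-field posterior and sampled at forward time, so predictions carry uncertainty; shrinking the posterior scales to zero recovers the deterministic LoRA update. All dimensions, the mean-field Gaussian form, and the variable names are illustrative assumptions, not the paper's exact Kronecker-factored SGP construction.

```python
import numpy as np

rng = np.random.default_rng(0)

d_in, d_out, r = 16, 16, 4  # toy dimensions; real ranks are an assumption

# Frozen pretrained weight and a probe input.
W0 = rng.normal(size=(d_out, d_in))
x = rng.normal(size=d_in)

# Variational parameters over the low-rank factors A (r x d_in) and
# B (d_out x r): per-entry means and log standard deviations.
# (A mean-field Gaussian posterior is an illustrative simplification.)
A_mu, A_logsig = np.zeros((r, d_in)), np.full((r, d_in), -2.0)
B_mu, B_logsig = rng.normal(scale=0.01, size=(d_out, r)), np.full((d_out, r), -2.0)

def forward_sample(x, rng):
    """One stochastic forward pass via the reparameterization trick."""
    A = A_mu + np.exp(A_logsig) * rng.normal(size=A_mu.shape)
    B = B_mu + np.exp(B_logsig) * rng.normal(size=B_mu.shape)
    return W0 @ x + B @ (A @ x)  # frozen path + sampled low-rank update

# Monte-Carlo predictive mean and variance from several posterior samples.
samples = np.stack([forward_sample(x, rng) for _ in range(32)])
mean, var = samples.mean(axis=0), samples.var(axis=0)

# Collapsing posterior uncertainty (sigma -> 0) recovers standard LoRA:
deterministic = W0 @ x + B_mu @ (A_mu @ x)
```

The predictive variance `var` is what the deterministic adapter cannot provide, and it is this spread over sampled updates that supports the calibration improvements reported above.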
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Commonsense Reasoning | ARC Challenge | Accuracy | 68.9 | 132 |
| Commonsense Reasoning | ARC-E | Accuracy | 85.91 | 62 |
| Commonsense Reasoning | WG-S | Accuracy | 70.9 | 18 |
| Commonsense Reasoning | WG-M | Accuracy | 74.3 | 18 |
| Commonsense Reasoning | BoolQ | Accuracy | 86.1 | 18 |
| Mathematical Reasoning | MATH | CoT NLL | 0.513 | 11 |
| Commonsense Reasoning | OpenBookQA | Accuracy | 81.6 | 9 |
| Question Answering | OBQA in-distribution (test) | Accuracy | 81.6 | 9 |
| Question Answering | ARC-C Small Shift (test) | Accuracy | 69.5 | 9 |
| Question Answering | ARC-E Small Shift (test) | Accuracy | 78.9 | 9 |