BIRD: A Trustworthy Bayesian Inference Framework for Large Language Models

About

Predictive models often need to work with incomplete information in real-world tasks. Consequently, they must provide reliable probability or confidence estimation, especially in large-scale decision-making and planning tasks. Current large language models (LLMs) are insufficient for accurate estimations, but they can generate relevant factors that may affect the probabilities, produce coarse-grained probabilities when the information is more complete, and help determine which factors are relevant to specific downstream contexts. In this paper, we make use of these capabilities of LLMs to provide a significantly more accurate probabilistic estimation. We propose BIRD, a novel probabilistic inference framework that aligns a Bayesian network with LLM abductions and then estimates more accurate probabilities in a deduction step. We show BIRD provides reliable probability estimations that are 30% better than those provided directly by LLM baselines. These estimates further contribute to better and more trustworthy decision making.
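As a rough illustration of the deduction step described above, the sketch below marginalizes an outcome probability over the values of a few factors. It is a minimal toy under stated assumptions, not the authors' implementation: the function name `marginal_outcome_probability`, the factor names, the assumption that factors are independent given the context, and every number are hypothetical and chosen only for illustration. In the framework described in the abstract, the factors and coarse probabilities would come from an LLM rather than from hand-written tables.

```python
from itertools import product


def marginal_outcome_probability(factor_dists, outcome_given_factors):
    """Return sum over factor assignments f of P(outcome | f) * P(f | context).

    factor_dists: dict mapping factor name -> {value: P(value | context)}.
    outcome_given_factors: callable taking {factor: value} and returning
        P(outcome | that assignment).
    Assumption (for this sketch only): factors are independent given the
    context, so P(f | context) is a product of per-factor probabilities.
    """
    names = list(factor_dists)
    total = 0.0
    # Enumerate every combination of factor values.
    for values in product(*(factor_dists[name] for name in names)):
        weight = 1.0
        for name, value in zip(names, values):
            weight *= factor_dists[name][value]
        total += weight * outcome_given_factors(dict(zip(names, values)))
    return total


# Hypothetical toy inputs: how likely is a delivery to be late?
factor_dists = {
    "weather": {"rain": 0.3, "dry": 0.7},
    "traffic": {"heavy": 0.5, "light": 0.5},
}

# Hypothetical conditional table P(late | weather, traffic).
late_table = {
    ("rain", "heavy"): 0.9,
    ("rain", "light"): 0.6,
    ("dry", "heavy"): 0.5,
    ("dry", "light"): 0.1,
}


def p_late_given(assignment):
    return late_table[(assignment["weather"], assignment["traffic"])]


p_late = marginal_outcome_probability(factor_dists, p_late_given)
print(round(p_late, 3))  # 0.435
```

The explicit enumeration keeps the arithmetic visible; it scales exponentially with the number of factors, so it is only suitable for small illustrative cases.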

Yu Feng, Ben Zhou, Weidong Lin, Dan Roth • 2024

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Fact Checking | COVID-Fact | Balanced Accuracy | 66.4 | 32 |
| Three-way probability ranking | COMMON2SENSE paired | F1 (C1) | 59 | 30 |
| Binary decision | TODAY | Accuracy | 72.4 | 27 |
| Binary decision | BIGDATA 22 | Accuracy | 56.2 | 27 |
| Binary decision | German Credit | Accuracy | 59.2 | 27 |
| Binary decision | PLASMA | Accuracy | 79.9 | 27 |
| Binary decision | COMMON2SENSE | Accuracy | 91.7 | 27 |
| Fact Checking | ExpertQA | Balanced Accuracy | 58.2 | 25 |
| Pairwise Preference Evaluation | COMMON2SENSE | Context 1 Preference Score | 58.7 | 21 |
| Fact Checking | cnn | Balanced Accuracy | 54 | 10 |
Showing 10 of 13 rows
