
Large Language Models Must Be Taught to Know What They Don't Know

About

When using large language models (LLMs) in high-stakes applications, we need to know when we can trust their predictions. Some works argue that prompting high-performance LLMs is sufficient to produce calibrated uncertainties, while others introduce sampling methods that can be prohibitively expensive. In this work, we first argue that prompting on its own is insufficient to achieve good calibration and then show that fine-tuning on a small dataset of correct and incorrect answers can create an uncertainty estimate with good generalization and small computational overhead. We show that a thousand graded examples are sufficient to outperform baseline methods and that training through the features of a model is necessary for good performance and tractable for large open-source models when using LoRA. We also investigate the mechanisms that enable reliable LLM uncertainty estimation, finding that many models can be used as general-purpose uncertainty estimators, applicable not just to their own uncertainties but also to the uncertainties of other models. Lastly, through a user study, we show that uncertainty estimates inform how humans use LLMs in human-AI collaborative settings.
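The core recipe above (fine-tune a small set of parameters on graded correct/incorrect answers so the model predicts its own correctness) can be illustrated with a toy sketch. This is not the authors' implementation and uses synthetic features rather than real LLM activations: a frozen weight matrix stands in for the base model, a LoRA-style low-rank update `B @ A` plus a linear head are the only trainable pieces, and training minimizes logistic loss on correctness labels.

```python
# Toy sketch (assumption: synthetic stand-in for LLM features, not the paper's code).
# A frozen base transform W is adapted by a trainable low-rank update B @ A (LoRA-style),
# and a linear head w predicts P(answer is correct) from the adapted features.
import numpy as np

rng = np.random.default_rng(0)
d, r, n = 16, 2, 512                            # feature dim, adapter rank, graded examples

W = rng.normal(size=(d, d)) / np.sqrt(d)        # frozen "base model" weights
X = rng.normal(size=(n, d))                     # toy features of n graded answers
y = (X @ rng.normal(size=d) > 0).astype(float)  # label: 1 = answer was graded correct

A = 0.01 * rng.normal(size=(r, d))              # trainable low-rank factors
B = np.zeros((d, r))                            # zero-init so training starts at the base model
w = np.zeros(d)                                 # trainable linear uncertainty head

def predict(X, A, B, w):
    """P(answer is correct) from LoRA-adapted features."""
    h = X @ (W + B @ A).T
    return 1.0 / (1.0 + np.exp(-(h @ w)))

lr = 0.5
for _ in range(500):                            # full-batch gradient descent on log-loss
    M = W + B @ A
    h = X @ M.T
    p = 1.0 / (1.0 + np.exp(-(h @ w)))
    g = (p - y) / n                             # dLogLoss / dlogit
    grad_M = np.outer(w, X.T @ g)               # gradient w.r.t. the low-rank update only
    gA, gB, gw = B.T @ grad_M, grad_M @ A.T, h.T @ g
    A, B, w = A - lr * gA, B - lr * gB, w - lr * gw

acc = ((predict(X, A, B, w) > 0.5) == y).mean() # how well the probe predicts correctness
```

Note that `W` never receives a gradient: only the rank-`r` factors and the head train, which is what keeps the real-model version tractable with LoRA.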

Sanyam Kapoor, Nate Gruver, Manley Roberts, Katherine Collins, Arka Pal, Umang Bhatt, Adrian Weller, Samuel Dooley, Micah Goldblum, Andrew Gordon Wilson • 2024

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Calibration | NQ | ECE 0.6171 | 55 |
| Question Answering | PopQA | Score 26.97 | 50 |
| Calibration | WebQ | ECE 52.27 | 31 |
| Calibration | SQuAD | ECE 67.03 | 31 |
| Mathematical Reasoning | GSM8K | Accuracy 25.25 | 29 |
| Knowledge Grounded Dialogue | WoW | F1 Score 15.58 | 15 |
| Slot Filling | T-REx | Accuracy 28.77 | 14 |
| Fact Verification | FEVER | Accuracy 58.7 | 11 |
| Expected Calibration Error | MusQ | ECE 76.15 | 10 |
| Expected Calibration Error | PopQA | ECE 61.78 | 10 |

Showing 10 of 18 rows.
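Several rows above report ECE (Expected Calibration Error), which measures the gap between a model's stated confidence and its empirical accuracy. A minimal sketch of the standard equal-width-binning estimator (the benchmark's exact binning scheme is an assumption here):

```python
# Standard binned ECE estimator: bucket predictions by confidence, then take the
# bin-size-weighted average gap between mean accuracy and mean confidence per bin.
import numpy as np

def expected_calibration_error(conf, correct, n_bins=10):
    conf = np.asarray(conf, dtype=float)        # predicted P(correct), in [0, 1]
    correct = np.asarray(correct, dtype=float)  # 1 if the answer was right, else 0
    # assign each prediction to one of n_bins equal-width confidence bins
    idx = np.minimum((conf * n_bins).astype(int), n_bins - 1)
    ece = 0.0
    for b in range(n_bins):
        mask = idx == b
        if mask.any():
            gap = abs(correct[mask].mean() - conf[mask].mean())
            ece += mask.mean() * gap            # weight by fraction of points in the bin
    return ece

# perfectly calibrated toy case: 50% confidence, 50% empirical accuracy
ece_good = expected_calibration_error([0.5, 0.5, 0.5, 0.5], [1, 0, 1, 0])
# badly calibrated toy case: near-certain confidence, always wrong
ece_bad = expected_calibration_error([0.99, 0.99], [0, 0])
```

A calibrated estimator drives this toward 0; confidently wrong predictions push it toward 1, which is why lower ECE means a better rank in the table.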

Other info

Code
