Language Models with Conformal Factuality Guarantees

About

Guaranteeing the correctness and factuality of language model (LM) outputs is a major open problem. In this work, we propose conformal factuality, a framework that can ensure high probability correctness guarantees for LMs by connecting language modeling and conformal prediction. We observe that the correctness of an LM output is equivalent to an uncertainty quantification problem, where the uncertainty sets are defined as the entailment set of an LM's output. Using this connection, we show that conformal prediction in language models corresponds to a back-off algorithm that provides high probability correctness guarantees by progressively making LM outputs less specific (and expanding the associated uncertainty sets). This approach applies to any black-box LM and requires very few human-annotated samples. Evaluations of our approach on closed book QA (FActScore, NaturalQuestions) and reasoning tasks (MATH) show that our approach can provide 80-90% correctness guarantees while retaining the majority of the LM's original output.

Christopher Mohri, Tatsunori Hashimoto • 2024
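The back-off procedure described in the abstract can be illustrated with a split conformal calibration step. What follows is a minimal sketch under stated assumptions, not the authors' implementation: it assumes each LM output has already been split into sub-claims, that each sub-claim carries a confidence score, and that a small calibration set has human labels marking each sub-claim correct or incorrect. The function names (nonconformity, calibrate_threshold, back_off) are hypothetical, and how sub-claims and scores are produced is outside the sketch.

```python
import math


def nonconformity(labeled_claims):
    """Smallest threshold at which every retained claim is correct.

    labeled_claims: list of (score, is_correct) pairs for one output.
    Claims are retained when score > threshold, so the answer is the
    largest score among incorrect claims (-inf if none are incorrect).
    """
    return max((s for s, ok in labeled_claims if not ok), default=-math.inf)


def calibrate_threshold(calibration, alpha):
    """Split conformal quantile of the per-example nonconformity scores.

    calibration: list of labeled_claims lists, one per annotated example.
    alpha: target error rate (e.g. 0.1 for a 90% correctness target).
    """
    n = len(calibration)
    scores = sorted(nonconformity(c) for c in calibration)
    k = math.ceil((n + 1) * (1 - alpha))  # rank of the conformal quantile
    if k > n:
        # Too few calibration examples for this alpha: retain nothing.
        return math.inf
    return scores[k - 1]


def back_off(claims, threshold):
    """Keep only claims scored strictly above the calibrated threshold.

    claims: list of (text, score) pairs for a new output.
    """
    return [text for text, score in claims if score > threshold]
```

If the calibration examples and the new example are exchangeable, the standard split conformal argument gives that all retained claims are correct with probability at least 1 - alpha. Lowering alpha raises the threshold, so more claims are dropped and the output becomes less specific.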

Related benchmarks

Task	Dataset	Result
Reinforcement Learning from Verifiable Rewards	HEAD-QA	AR 48.6	30
Long-form generation	FActScore (test)	AUROC 0.6713	12
Long-form generation	PopQA (test)	AUROC 0.6753	12
Distribution Shift Robustness	Sixteen Adversarial Cells MedQA + GSM8K (eval)	Violations 7	10
Expert-Iteration RLVR	MedQA, HEAD-QA, ARC-C, and CaseHOLD	Pathwise Clean Score 3	10
Natural Language Inference	medNLI	AR (%) 76.8	10
Mathematical Reasoning	GSM8K	AR (%) 9	10
Question Answering	MedQA	AR (%) 28.4	9
Question Answering	CaseHOLD	AR (%) 22.6	9
Question Answering	PubMedQA	AR (%) 34.8	8

Showing 10 of 32 rows
