
Out-of-Distribution Detection and Selective Generation for Conditional Language Models

About

Machine learning algorithms typically assume that training and test samples are independent and identically distributed. Much work has shown that high-performing ML classifiers can degrade significantly and produce overconfident, incorrect predictions, particularly on out-of-distribution (OOD) inputs. Conditional language models (CLMs) are predominantly trained to classify the next token in an output sequence, and may suffer even worse degradation on OOD inputs because prediction is done auto-regressively over many steps. Furthermore, the space of potential low-quality outputs is larger, since arbitrary text can be generated, so it is important to know when to trust the generated output. We present a highly accurate and lightweight OOD detection method for CLMs, and demonstrate its effectiveness on abstractive summarization and translation. We also show how our method can be used under the common and realistic setting of distribution shift for selective generation (analogous to selective prediction for classification): producing high-quality outputs while automatically abstaining from low-quality ones, which enables safer deployment of generative language models.

Jie Ren, Jiaming Luo, Yao Zhao, Kundan Krishna, Mohammad Saleh, Balaji Lakshminarayanan, Peter J. Liu (2022)
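
The selective generation idea in the abstract (keep outputs the model is likely to get right, abstain on the rest) can be illustrated with a minimal sketch. It assumes each input already has a scalar OOD score where higher means "more out-of-distribution"; how the paper computes that score is not stated in the abstract, so the scores, the quality values, and the function name `select_outputs` below are all hypothetical and only the abstention step is shown.

```python
import numpy as np

def select_outputs(ood_scores, abstain_fraction):
    """Return a boolean mask: True = keep the generated output, False = abstain.

    Assumption: a higher OOD score means the input is less like the
    training data, so its generated output is less trustworthy.
    """
    scores = np.asarray(ood_scores, dtype=float)
    n_abstain = int(round(abstain_fraction * scores.size))
    keep = np.ones(scores.size, dtype=bool)
    if n_abstain > 0:
        # Indices of the n_abstain highest (most OOD) scores.
        most_ood = np.argsort(scores)[-n_abstain:]
        keep[most_ood] = False
    return keep

# Hypothetical scores for five inputs (illustrative values, not from the paper).
ood_scores = [0.1, 2.5, 0.3, 1.9, 0.2]
# Hypothetical per-output quality, e.g. a normalized summarization metric.
quality = np.array([0.9, 0.2, 0.8, 0.3, 0.85])

keep = select_outputs(ood_scores, abstain_fraction=0.4)
print("kept outputs:", keep.tolist())
print("mean quality, all outputs:", quality.mean())
print("mean quality, kept outputs:", quality[keep].mean())
```

In this toy setting the two inputs with the highest scores are dropped, and the mean quality of the remaining outputs is higher than the mean over all outputs. That is the trade-off selective generation targets: coverage is reduced in exchange for quality on what is kept.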

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Hallucination Detection | TriviaQA | AUROC 0.7461 | 621 |
| Mathematical Reasoning | MATH 500 | Accuracy (Acc) 82.8 | 543 |
| Mathematical Reasoning | AIME 24 | Accuracy 70.42 | 318 |
| Hallucination Detection | HotpotQA | AUROC 0.55 | 249 |
| Hallucination Detection | TriviaQA (test) | AUC-ROC 83.1 | 243 |
| Hallucination Detection | TruthfulQA | AUC (ROC) 0.5972 | 178 |
| Hallucination Detection | HaluEval (test) | AUC-ROC 62.85 | 176 |
| Hallucination Detection | NQ | AUC 0.746 | 154 |
| Hallucination Detection | HaluEval | AUROC 0.5525 | 131 |
| Hallucination Detection | GSM8K | AUROC 76.86 | 115 |
Showing 10 of 115 rows