Know Your Limits: A Survey of Abstention in Large Language Models

About

Abstention, the refusal of large language models (LLMs) to provide an answer, is increasingly recognized for its potential to mitigate hallucinations and enhance safety in LLM systems. In this survey, we introduce a framework to examine abstention from three perspectives: the query, the model, and human values. We organize the literature on abstention methods, benchmarks, and evaluation metrics using this framework, and discuss merits and limitations of prior work. We further identify and motivate areas for future research, such as whether abstention can be achieved as a meta-capability that transcends specific tasks or domains, and opportunities to optimize abstention abilities in specific contexts. In doing so, we aim to broaden the scope and impact of abstention methodologies in AI systems.

Bingbing Wen, Jihan Yao, Shangbin Feng, Chenjun Xu, Yulia Tsvetkov, Bill Howe, Lucy Lu Wang • 2024

Related benchmarks

Task	Dataset	Metric	Result	Rank
Agentic Task Performance	τ2-Bench Airline 1.0 (test)	Completion Accuracy (CAP)	95.3	48
Agentic Task Performance	τ2-Bench Retail 1.0 (test)	Completion Accuracy (CAP)	89	48

