Coarse-to-Fine Highlighting: Reducing Knowledge Hallucination in Large Language Models

About

The generation of plausible but incorrect factual information, often termed hallucination, has attracted significant research interest. Retrieval-augmented language models (RALMs), which enhance models with up-to-date knowledge, have emerged as a promising way to reduce hallucination. However, existing RALMs may instead exacerbate hallucination when they retrieve lengthy contexts. To address this challenge, we propose COFT, a novel COarse-to-Fine highlighTing method that focuses on key texts at different levels of granularity, so that the model avoids getting lost in lengthy contexts. Specifically, COFT consists of three components: recaller, scorer, and selector. First, the recaller applies a knowledge graph to extract potential key entities in a given context. Second, the scorer measures the importance of each entity by calculating its contextual weight. Finally, the selector selects entities with high contextual weight using a dynamic threshold algorithm and highlights the corresponding paragraphs, sentences, or words in a coarse-to-fine manner. Extensive experiments on the knowledge hallucination benchmark demonstrate the effectiveness of COFT, with an improvement of more than 30% in F1 score. Moreover, COFT exhibits remarkable versatility across various long-form tasks, such as reading comprehension and question answering.
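The recaller, scorer, and selector pipeline described in the abstract can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration, not the authors' implementation: a plain list of entity names stands in for the knowledge graph, the "contextual weight" is approximated by mention frequency times an inverse sentence frequency, and the "dynamic threshold" is simply a multiple (`tau`) of the mean weight. The function names and the `**...**` highlight markers are likewise hypothetical.

```python
import math
import re


def split_sentences(text):
    # Naive splitter: break after ., ! or ? followed by whitespace.
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def recall(context, known_entities):
    # Recaller stand-in: keep the known entities that occur in the context.
    # (Assumption: a flat entity list replaces the knowledge graph.)
    lowered = context.lower()
    return [e for e in known_entities if e.lower() in lowered]


def score(context, entities):
    # Scorer stand-in: mention count weighted by inverse sentence frequency.
    # (Assumption: the abstract does not define the contextual weight.)
    sentences = [s.lower() for s in split_sentences(context)]
    lowered = context.lower()
    weights = {}
    for e in entities:
        tf = lowered.count(e.lower())
        sf = sum(1 for s in sentences if e.lower() in s)
        weights[e] = tf * math.log(1 + len(sentences) / sf) if sf else 0.0
    return weights


def select(weights, tau=1.0):
    # Selector stand-in: keep entities whose weight is at least tau * mean.
    # (Assumption: a simple data-dependent threshold, not the paper's rule.)
    if not weights:
        return []
    mean = sum(weights.values()) / len(weights)
    return [e for e, w in weights.items() if w >= tau * mean]


def highlight(context, selected, granularity="word"):
    # Wrap the chosen span in ** ** at paragraph, sentence, or word level.
    if not selected:
        return context

    def has_entity(span):
        return any(e.lower() in span.lower() for e in selected)

    if granularity == "paragraph":
        paragraphs = [p for p in re.split(r"\n\s*\n", context.strip()) if p]
        return "\n\n".join(f"**{p}**" if has_entity(p) else p for p in paragraphs)
    if granularity == "sentence":
        sentences = split_sentences(context)
        return " ".join(f"**{s}**" if has_entity(s) else s for s in sentences)
    if granularity == "word":
        # Longest entities first so overlapping names are matched whole.
        ordered = sorted(selected, key=len, reverse=True)
        pattern = "|".join(re.escape(e) for e in ordered)
        return re.sub(pattern, lambda m: f"**{m.group(0)}**", context,
                      flags=re.IGNORECASE)
    raise ValueError(f"unknown granularity: {granularity}")
```

With this sketch, a caller would run `recall`, `score`, and `select` once and then call `highlight` with the granularity that suits the context length: coarse paragraph highlighting for very long inputs, word-level highlighting for short ones. A larger `tau` highlights fewer entities.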

Qitan Lv, Jie Wang, Hanzhu Chen, Bin Li, Yongdong Zhang, Feng Wu • 2024

Related benchmarks

Task	Dataset	Metric	Result	
Question Answering	2Wiki	EM	45.5	260
Multi-hop Question Answering	HotpotQA	LLM Judge Score	61.71	72
Question Answering	Bamboogle	EM	40.3	61
Question Answering	MuSiQue	EM	18.5	57
Multi-hop Question Answering	2Wiki	EM	41.86	16
Multi-hop Question Answering	Bamboogle	EM	35.71	16
Multi-hop Question Answering	MuSiQue	EM	17.12	16
Multi-question Reasoning	MuSiQue-3Q	Exact Match (EM)	12.6	6
Multi-question Reasoning	HotpotQA 3Q	Exact Match Accuracy (3Q)	24.5	6
Multi-question Reasoning	2Wiki-3Q	Exact Match (EM)	27.2	6

Showing 10 of 15 rows
