MindRef: Mimicking Human Memory for Hierarchical Reference Retrieval with Fine-Grained Location Awareness
About
When completing knowledge-intensive tasks, humans sometimes need not only an answer but also a corresponding reference passage for auxiliary reading. Previous methods required obtaining pre-segmented article chunks through additional retrieval models. This paper explores leveraging the parametric knowledge stored during the pre-training phase of large language models (LLMs) to independently recall reference passages from any starting position. We propose a two-stage framework that simulates how humans recall easily forgotten references. First, the LLM is prompted to recall document title identifiers to obtain a coarse-grained document set. Then, based on this document set, it recalls fine-grained passages. Throughout the two-stage recall process, constrained decoding ensures that no content outside the stored documents is generated. To speed up decoding, the second stage recalls only a short passage prefix, whose position is then located to retrieve the complete passage. Experiments on KILT knowledge-sensitive tasks verify that LLMs can independently recall the locations of reference passages across various task forms, and that the retrieved references significantly assist downstream tasks.
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Question Answering | HotpotQA | EM | 26.13 | 79 |
| Question Answering | NQ | EM | 31.69 | 57 |
| Dialogue | WoW | F1 Score | 14.77 | 8 |
| Fact Checking | FEVER | Accuracy | 78.79 | 8 |
| Open-domain QA | TriviaQA | EM | 72.94 | 8 |
| Open-domain QA | ELI5 | R-L | 20.61 | 8 |
| Fine-grained passage-level retrieval | TriviaQA | Answer in Context | 68.2 | 7 |
| Fine-grained passage-level retrieval | HotpotQA | Answer in Context | 30.04 | 7 |
| Fine-grained passage-level retrieval | ELI5 | Answer in Context | 16.85 | 7 |
| Fine-grained passage-level retrieval | FEVER | Entity in Context | 58.42 | 7 |