MindRef: Mimicking Human Memory for Hierarchical Reference Retrieval with Fine-Grained Location Awareness
About
When completing knowledge-intensive tasks, humans sometimes need not only an answer but also a corresponding reference passage for auxiliary reading. Previous methods required obtaining pre-segmented article chunks through additional retrieval models. This paper explores leveraging the parametric knowledge stored during the pre-training phase of large language models (LLMs) to independently recall reference passages from any starting position. We propose a two-stage framework that simulates how humans recall easily forgotten references. First, the LLM is prompted to recall document title identifiers to obtain a coarse-grained document set. Then, based on this document set, it recalls fine-grained passages. Throughout the two-stage recall process, constrained decoding ensures that no content outside the stored documents is generated. To speed up decoding, the second stage recalls only a short passage prefix, whose position is then located to retrieve the complete passage. Experiments on KILT knowledge-sensitive tasks verify that LLMs can independently recall the locations of reference passages across various task forms, and that the retrieved references significantly assist downstream tasks.
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Question Answering | HotpotQA | EM | 26.13 | 79 |
| Question Answering | NQ | EM | 31.69 | 57 |
| Dialogue | WoW | F1 Score | 14.77 | 8 |
| Fact Checking | FEVER | Accuracy | 78.79 | 8 |
| Open-domain QA | TriviaQA | EM | 72.94 | 8 |
| Open-domain QA | ELI5 | R-L | 20.61 | 8 |
| Fine-grained passage-level retrieval | TriviaQA | Answer in Context | 68.2 | 7 |
| Fine-grained passage-level retrieval | HotpotQA | Answer in Context | 30.04 | 7 |
| Fine-grained passage-level retrieval | ELI5 | Answer in Context | 16.85 | 7 |
| Fine-grained passage-level retrieval | FEVER | Entity in Context | 58.42 | 7 |