NeuroPath: Neurobiology-Inspired Path Tracking and Reflection for Semantically Coherent Retrieval
About
Retrieval-augmented generation (RAG) greatly enhances large language models (LLMs) performance in knowledge-intensive tasks. However, naive RAG methods struggle with multi-hop question answering due to their limited capacity to capture complex dependencies across documents. Recent studies employ graph-based RAG to capture document connections. However, these approaches often result in a loss of semantic coherence and introduce irrelevant noise during node matching and subgraph construction. To address these limitations, we propose NeuroPath, an LLM-driven semantic path tracking RAG framework inspired by the path navigational planning of place cells in neurobiology. It consists of two steps: Dynamic Path Tracking and Post-retrieval Completion. Dynamic Path Tracking performs goal-directed semantic path tracking and pruning over the constructed knowledge graph (KG), improving noise reduction and semantic coherence. Post-retrieval Completion further reinforces these benefits by conducting second-stage retrieval using intermediate reasoning and the original query to refine the query goal and complete missing information in the reasoning path. NeuroPath surpasses current state-of-the-art baselines on three multi-hop QA datasets, achieving average improvements of 16.3% on recall@2 and 13.5% on recall@5 over advanced graph-based RAG methods. Moreover, compared to existing iter-based RAG methods, NeuroPath achieves higher accuracy and reduces token consumption by 22.8%. Finally, we demonstrate the robustness of NeuroPath across four smaller LLMs (Llama3.1, GLM4, Mistral0.3, and Gemma3), and further validate its scalability across tasks of varying complexity. Code is available at https://github.com/KennyCaty/NeuroPath.
Related benchmarks
| Task | Dataset | Result | Rank | |
|---|---|---|---|---|
| Multi-hop Question Answering | 2WikiMultihopQA | EM63.4 | 387 | |
| Multi-hop Question Answering | HotpotQA | F1 Score64.9 | 294 | |
| Multi-hop Question Answering | MuSiQue | F143.5 | 38 | |
| Multi-hop QA Retrieval | MuSiQue | R@247.3 | 36 | |
| Multi-hop QA Retrieval | 2Wiki | Recall@277.9 | 23 | |
| Multi-hop Retrieval | HotpotQA | Recall@276.2 | 23 | |
| Multi-hop Retrieval | Average MuSiQue, 2wiki, HotpotQA | R@267.1 | 10 |