Scent of Knowledge: Optimizing Search-Enhanced Reasoning with Information Foraging
About
Augmenting large language models (LLMs) with external retrieval has become a standard method to address their inherent knowledge cutoff limitations. However, traditional retrieval-augmented generation methods employ static, pre-inference retrieval strategies, making them inadequate for complex tasks involving ambiguous, multi-step, or evolving information needs. Recent advances in test-time scaling techniques have demonstrated significant potential in enabling LLMs to dynamically interact with external tools, motivating the shift toward adaptive inference-time retrieval. Inspired by Information Foraging Theory (IFT), we propose InForage, a reinforcement learning framework that formalizes retrieval-augmented reasoning as a dynamic information-seeking process. Unlike existing approaches, InForage explicitly rewards intermediate retrieval quality, encouraging LLMs to iteratively gather and integrate information through adaptive search behaviors. To facilitate training, we construct a human-guided dataset capturing iterative search and reasoning trajectories for complex, real-world web tasks. Extensive evaluations across general question answering, multi-hop reasoning tasks, and a newly developed real-time web QA dataset demonstrate InForage's superior performance over baseline methods. These results highlight InForage's effectiveness in building robust, adaptive, and efficient reasoning agents.
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Multi-hop Question Answering | 2WikiMultihopQA | EM 42.8 | 387 |
| Multi-hop Question Answering | HotpotQA (test) | -- | 255 |
| Multi-hop Question Answering | 2WikiMultiHopQA (test) | EM 42.8 | 195 |
| Question Answering | PopQA | -- | 186 |
| Multi-hop Question Answering | Bamboogle | EM 36 | 128 |
| Multi-hop Question Answering | HotpotQA | EM 40.9 | 117 |
| Question Answering | TriviaQA | -- | 112 |
| Multi-hop Question Answering | MuSiQue (test) | -- | 111 |
| Question Answering | HotpotQA | EM 40.9 | 109 |
| Question Answering | 2WikiMultihopQA | EM 42.8 | 107 |
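
The results above are reported as exact match (EM). As a rough illustration of how such a score is commonly computed, here is a minimal sketch that assumes SQuAD-style answer normalization (lowercasing, stripping punctuation and English articles, collapsing whitespace). The function names `normalize_answer`, `exact_match`, and `em_score` are hypothetical, and the evaluation code behind the numbers in the table may differ in its details.

```python
import re
import string


def normalize_answer(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace.

    Assumption: SQuAD-style normalization; the benchmarks listed above
    may apply different rules.
    """
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, gold_answers: list[str]) -> bool:
    """True if the normalized prediction equals any normalized gold answer."""
    pred = normalize_answer(prediction)
    return any(pred == normalize_answer(gold) for gold in gold_answers)


def em_score(predictions: list[str], references: list[list[str]]) -> float:
    """Percentage of predictions that exactly match one of their gold answers."""
    if not predictions:
        return 0.0
    hits = sum(exact_match(p, refs) for p, refs in zip(predictions, references))
    return 100.0 * hits / len(predictions)


if __name__ == "__main__":
    # Illustrative inputs only, not taken from any dataset in the table.
    preds = ["The Eiffel Tower", "Paris, France"]
    golds = [["Eiffel Tower"], ["Paris"]]
    print(em_score(preds, golds))
```

Under this scheme a prediction earns credit only when it matches a gold answer in full after normalization, so an answer that is correct but more verbose, such as "Paris, France" against "Paris", scores zero.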