Scent of Knowledge: Optimizing Search-Enhanced Reasoning with Information Foraging
About
Augmenting large language models (LLMs) with external retrieval has become a standard method to address their inherent knowledge cutoff limitations. However, traditional retrieval-augmented generation methods employ static, pre-inference retrieval strategies, making them inadequate for complex tasks involving ambiguous, multi-step, or evolving information needs. Recent advances in test-time scaling techniques have demonstrated significant potential in enabling LLMs to dynamically interact with external tools, motivating the shift toward adaptive inference-time retrieval. Inspired by Information Foraging Theory (IFT), we propose InForage, a reinforcement learning framework that formalizes retrieval-augmented reasoning as a dynamic information-seeking process. Unlike existing approaches, InForage explicitly rewards intermediate retrieval quality, encouraging LLMs to iteratively gather and integrate information through adaptive search behaviors. To facilitate training, we construct a human-guided dataset capturing iterative search and reasoning trajectories for complex, real-world web tasks. Extensive evaluations across general question answering, multi-hop reasoning tasks, and a newly developed real-time web QA dataset demonstrate InForage's superior performance over baseline methods. These results highlight InForage's effectiveness in building robust, adaptive, and efficient reasoning agents.
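The idea of rewarding intermediate retrieval quality alongside the final answer can be illustrated with a toy reward function. This is only a sketch under stated assumptions, not InForage's actual reward: the `Step` structure, the `retrieval_gain` measure (fraction of reference evidence strings found in retrieved passages), and the `gain_weight` parameter are all hypothetical names introduced here for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Step:
    """One search step in a reasoning trajectory (hypothetical structure)."""
    query: str
    retrieved: List[str]  # text of the passages returned for this query


def retrieval_gain(steps: List[Step], evidence: List[str]) -> float:
    """Fraction of reference evidence strings found in any retrieved passage."""
    if not evidence:
        return 0.0
    found = set()
    for step in steps:
        for passage in step.retrieved:
            for item in evidence:
                if item.lower() in passage.lower():
                    found.add(item)
    return len(found) / len(evidence)


def trajectory_reward(steps: List[Step], answer: str, gold_answer: str,
                      evidence: List[str], gain_weight: float = 0.5) -> float:
    """Outcome reward (exact match) plus a weighted intermediate retrieval term.

    The weighting scheme is an assumption for illustration only.
    """
    outcome = 1.0 if answer.strip().lower() == gold_answer.strip().lower() else 0.0
    return outcome + gain_weight * retrieval_gain(steps, evidence)
```

With such a shaping term, a trajectory that retrieves useful evidence earns partial credit even when the final answer is wrong, which is the intuition behind rewarding intermediate search behavior rather than the outcome alone.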
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Multi-hop Question Answering | HotpotQA (test) | -- | 198 |
| Multi-hop Question Answering | 2WikiMultiHopQA (test) | -- | 143 |
| Multi-hop Question Answering | MuSiQue (test) | -- | 111 |
| Multi-hop Question Answering | Bamboogle (test) | -- | 46 |
| Single-hop Question Answering | PopQA (test) | Accuracy 37.4 | 21 |
| Single-hop Question Answering | NQ (Natural Questions) (test) | Accuracy 38.6 | 21 |
| Single-hop Question Answering | TriviaQA (test) | Accuracy 50.4 | 21 |