Search-R3: Unifying Reasoning and Embedding in Large Language Models
About
Despite their remarkable natural language understanding capabilities, Large Language Models (LLMs) have been underutilized for retrieval tasks. We present Search-R3, a novel framework that addresses this limitation by adapting LLMs to generate search embeddings as a direct output of their reasoning process. Our approach exploits LLMs' chain-of-thought capabilities, allowing them to produce more effective embeddings by reasoning step by step through complex semantic analyses. We implement this through three complementary mechanisms: (1) a supervised learning stage that establishes the model's ability to produce quality embeddings, (2) a reinforcement learning (RL) methodology that optimizes embedding generation alongside reasoning, and (3) a specialized RL environment that efficiently handles evolving embedding representations without requiring complete corpus re-encoding at each training iteration. Our extensive evaluations on diverse benchmarks demonstrate that Search-R3 significantly outperforms prior methods by unifying the reasoning and embedding generation processes. This integrated post-training approach represents a substantial advancement in handling complex knowledge-intensive tasks that require both sophisticated reasoning and effective information retrieval. Project page: https://github.com/ytgui/Search-R3
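To make the role of a search embedding concrete, the following is a minimal sketch of embedding-based retrieval in general, not the Search-R3 implementation. It assumes the query and documents have already been mapped to fixed-size vectors (in Search-R3 these would come from the model after its reasoning step) and ranks documents by cosine similarity. The function names, the toy vectors, and the choice of cosine similarity are illustrative assumptions.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_u * norm_v)


def rank_documents(query_vec, doc_vecs, top_k=3):
    """Return the top_k (doc_id, score) pairs, highest similarity first."""
    scored = [(doc_id, cosine(query_vec, vec)) for doc_id, vec in doc_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical 3-dimensional embeddings standing in for model outputs.
corpus = {
    "doc_a": [1.0, 0.0, 0.0],
    "doc_b": [0.6, 0.8, 0.0],
    "doc_c": [0.0, 0.0, 1.0],
}
query = [0.9, 0.1, 0.0]

ranking = rank_documents(query, corpus, top_k=2)
for doc_id, score in ranking:
    print(doc_id, round(score, 4))
```

In this setting, better embeddings mean that relevant documents land closer to the query vector, which is the quality that retrieval metrics such as nDCG measure.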
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Information Retrieval | BRIGHT | Biology nDCG@10: 13.8 | 45 |
| Information Retrieval | SciFact | nDCG@10: 0.667 | 36 |
| Information Retrieval | MedQA | nDCG@10: 77.9 | 23 |
| Information Retrieval | FollowIR | MAP@5 (Robust04): 3.2 | 17 |
| Information Retrieval | BrowseComp+ | Recall@50.00e+0 | 17 |
| Information Retrieval | DS1000 | nDCG@10: 61.1 | 11 |
| Retrieval | Wikipedia retrieval dataset synthetic (test) | nDCG@10: 89.5 | 10 |
| Retrieval | LitSearch | nDCG@10.326 | 6 |
| Retrieval | MKQA eng | nDCG@1: 15.1 | 6 |
| Retrieval | SciFact | nDCG@10.56 | 6 |
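
Most of the results above are nDCG@k scores. As a reference for reading them, here is a minimal sketch of the standard nDCG@k computation, assuming linear gain (the relevance label itself) and a log2 position discount. Benchmarks differ in details such as exponential gain, and in whether scores are reported as fractions or percentages, so this is an illustration and not the evaluation code behind these numbers. The function names are hypothetical.

```python
import math


def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the first k relevance labels."""
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances[:k]))


def ndcg_at_k(ranked_relevances, k, all_relevances=None):
    """nDCG@k for one query.

    ranked_relevances: relevance labels in the order the system returned them.
    all_relevances: labels of every judged document for the query; if omitted,
    the ideal ranking is built from the returned list only (an assumption).
    """
    pool = ranked_relevances if all_relevances is None else all_relevances
    ideal = sorted(pool, reverse=True)
    ideal_dcg = dcg_at_k(ideal, k)
    if ideal_dcg == 0.0:
        return 0.0  # no relevant documents: define the score as 0
    return dcg_at_k(ranked_relevances, k) / ideal_dcg


# Relevant documents returned at ranks 2 and 4 (binary labels).
score = ndcg_at_k([0, 1, 0, 1], k=10)
print(round(score, 4))
```

A system that places all relevant documents first scores 1.0, and the score drops as relevant documents move further down the ranking.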