Hybrid Deep Searcher: Scalable Parallel and Sequential Search Reasoning
About
Large reasoning models (LRMs) combined with retrieval-augmented generation (RAG) have enabled deep research agents capable of multi-step reasoning with external knowledge retrieval. However, we find that existing approaches rarely demonstrate test-time search scaling. Methods that extend reasoning through single-query sequential search suffer from limited evidence coverage, while approaches that generate multiple independent queries per step often lack structured aggregation, hindering deeper sequential reasoning. We propose a hybrid search strategy to address these limitations. We introduce HybridDeepSearcher, a structured search agent that integrates parallel query expansion with explicit evidence aggregation before advancing to deeper sequential reasoning. To supervise this behavior, we introduce HDS-QA, a novel dataset that guides models to combine broad parallel search with structured aggregation through supervised reasoning-query-retrieval trajectories containing parallel sub-queries. Across five benchmarks, HybridDeepSearcher significantly outperforms the state of the art, improving F1 scores by +15.9 on FanOutQA and +9.2 on a subset of BrowseComp. Further analysis shows consistent test-time search scaling: performance improves as additional search turns or calls are allowed, while competing methods plateau.
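The hybrid strategy described above (parallel sub-queries within a turn, aggregation of their evidence, then a deeper sequential turn conditioned on that evidence) can be illustrated with a minimal Python sketch. This is not the authors' implementation: `toy_search`, `toy_planner`, `hybrid_search`, and `TOY_CORPUS` are hypothetical names, the planner is a hand-written stand-in for the reasoning model, and the corpus values are placeholders rather than real data.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Tuple

# Hypothetical toy corpus standing in for a real retriever (placeholder values).
TOY_CORPUS: Dict[str, str] = {
    "capital of France": "Paris",
    "capital of Japan": "Tokyo",
    "population of Paris": "about 2.1 million",
    "population of Tokyo": "about 14 million",
}


def toy_search(query: str) -> str:
    """Assumed search backend: exact-match lookup in the toy corpus."""
    return TOY_CORPUS.get(query, "no result")


def toy_planner(question: str, evidence: Dict[str, str]) -> List[str]:
    """Assumed stand-in for the reasoning model: choose the next sub-queries.

    Turn 1 fans out over both countries; turn 2 depends on the aggregated
    answers from turn 1; an empty list means no further search is needed.
    """
    if not evidence:
        return ["capital of France", "capital of Japan"]
    if len(evidence) == 2:
        return [f"population of {city}" for city in evidence.values()]
    return []


def hybrid_search(
    question: str,
    planner: Callable[[str, Dict[str, str]], List[str]],
    search: Callable[[str], str],
    max_turns: int = 4,
) -> Tuple[Dict[str, str], int]:
    """Alternate parallel query expansion with evidence aggregation."""
    evidence: Dict[str, str] = {}
    turns = 0
    with ThreadPoolExecutor(max_workers=4) as pool:
        for _ in range(max_turns):
            sub_queries = planner(question, evidence)
            if not sub_queries:
                break
            # Parallel step: issue all sub-queries of this turn concurrently.
            results = list(pool.map(search, sub_queries))
            # Aggregation step: merge results before the next sequential turn.
            evidence.update(zip(sub_queries, results))
            turns += 1
    return evidence, turns


if __name__ == "__main__":
    found, used_turns = hybrid_search(
        "Compare the populations of the capitals of France and Japan.",
        toy_planner,
        toy_search,
    )
    print(used_turns, found)
```

In this sketch, `max_turns` plays the role of the search-turn budget: lowering it to 1 stops the agent after the first fan-out, before the dependent population queries can be issued.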
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Question Answering | FRAMES | Accuracy | 54 | 14 |
| Multi-constraint search problem solving | LiveDRBench (BrowseComp, DeepSearchQA, FRAMES, LiveDRBench, WebWalkerQA) 1.0 (test) | Accuracy | 16.3 | 14 |
| Question Answering | FanOutQA | F1 Score | 44.1 | 9 |
| Question Answering | MedBrowseComp | F1 Score | 23.2 | 9 |
| Question Answering | BrowseComp | F1 | 15.1 | 9 |
| Question Answering | MuSiQue | F1 Score | 31.2 | 9 |
| Evidence Retrieval | MuSiQue | Evidence Coverage Rate | 40.7 | 6 |
| Evidence Retrieval | FanOutQA | Evidence Coverage Rate | 61 | 6 |
| Evidence Retrieval | FRAMES | Evidence Coverage Rate | 55.8 | 6 |