Laser: Governing Long-Horizon Agentic Search via Structured Protocol and Context Register
About
Recent advances in Large Language Models (LLMs) and Large Reasoning Models (LRMs) have enabled agentic search systems that interleave multi-step reasoning with external tool use. However, existing frameworks largely rely on unstructured natural-language reasoning and accumulate raw intermediate traces in the context, which often leads to unstable reasoning trajectories, context overflow, and degraded performance on complex multi-hop queries. In this study, we introduce Laser, a general framework for stabilizing and scaling agentic search. Laser defines a symbolic action protocol that organizes agent behaviors into three spaces: planning, task-solving, and retrospection. Each action is specified with explicit semantics and a deterministic execution format, which enables structured, logical reasoning and reliable action parsing. This design makes intermediate decisions interpretable and traceable, supporting explicit retrospection and fine-grained control over reasoning trajectories. In coordination with the parsable actions, Laser further maintains a compact context register that stores only the essential states of the reasoning process, allowing the agent to reason over long horizons without uncontrolled context expansion. Experiments on Qwen2.5- and Qwen3-series models across challenging multi-hop QA datasets show that Laser consistently outperforms existing agentic search baselines under both prompting-only and fine-tuning settings, demonstrating that Laser provides a principled and effective foundation for robust, scalable agentic search.
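To make the two ideas in the abstract more concrete, here is a minimal Python sketch of a parsable action format and a compact context register. It is an illustration under stated assumptions, not Laser's actual protocol: the action names (`PLAN`, `SEARCH`, `VERIFY`, and so on), the `NAME[argument]` syntax, and the `ContextRegister` fields are all hypothetical and chosen only for this example.

```python
import re
from dataclasses import dataclass, field

# Hypothetical action names, grouped into the three spaces named in the abstract.
ACTION_SPACES = {
    "planning": {"PLAN", "DECOMPOSE"},
    "task_solving": {"SEARCH", "ANSWER"},
    "retrospection": {"VERIFY", "REVISE"},
}
ACTION_TO_SPACE = {
    name: space for space, names in ACTION_SPACES.items() for name in names
}

# Assumed format: one action per line, e.g. SEARCH[capital of France].
ACTION_RE = re.compile(r"^([A-Z_]+)\[(.*)\]$")


@dataclass(frozen=True)
class Action:
    space: str
    name: str
    argument: str


def parse_action(line: str) -> Action:
    """Parse one line into an Action, rejecting anything off-protocol."""
    match = ACTION_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not a well-formed action: {line!r}")
    name, argument = match.groups()
    if name not in ACTION_TO_SPACE:
        raise ValueError(f"unknown action: {name}")
    return Action(ACTION_TO_SPACE[name], name, argument)


@dataclass
class ContextRegister:
    """Keeps only distilled state instead of the raw interaction trace."""
    question: str
    plan: list = field(default_factory=list)
    facts: list = field(default_factory=list)
    max_facts: int = 5  # hypothetical cap that bounds context growth

    def update(self, action: Action, observation: str = "") -> None:
        if action.name == "PLAN":
            # A new plan overwrites the old one rather than being appended.
            self.plan = [step.strip() for step in action.argument.split(";")]
        elif action.name == "SEARCH" and observation:
            # Store a short query/result pair, then keep the newest entries only.
            self.facts.append(f"{action.argument} -> {observation}")
            self.facts = self.facts[-self.max_facts:]

    def render(self) -> str:
        """Return the compact state that would be placed in the prompt."""
        lines = [f"Question: {self.question}"]
        lines += [f"Plan {i}: {step}" for i, step in enumerate(self.plan, 1)]
        lines += [f"Fact: {fact}" for fact in self.facts]
        return "\n".join(lines)
```

In this sketch, free-form text fails parsing outright, which is what makes each step checkable, and the rendered register stays bounded by `max_facts` no matter how many search steps the agent takes.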
Related benchmarks
| Task | Dataset | Metric | Score | Rank |
|---|---|---|---|---|
| Agentic Search | BrowseComp-ZH (test) | LJFT | 21.45 | 12 |
| Agentic Search | Bamboogle | LJFT Score | 64.8 | 12 |
| Agentic Search | MuSiQue | LJFT Score | 24.2 | 12 |
| Agentic Search | Web Dancer | LJFT | 49.24 | 12 |
| Multi-hop Question Answering | BrowseComp-ZH | LJFT | 9.69 | 5 |
| Multi-hop Question Answering | MuSiQue | LJFT | 23.6 | 5 |
| Multi-hop Question Answering | Web Dancer | LJFT | 47.72 | 5 |
| Multi-hop Question Answering | Average (BrowseComp-ZH, Bamboogle, MuSiQue, Web Dancer) (Overall) | LJFT Score | 35.65 | 5 |
| Multi-hop Question Answering | Bamboogle | LJFT Score | 61.6 | 5 |