Laser: Governing Long-Horizon Agentic Search via Structured Protocol and Context Register

About

Recent advances in Large Language Models (LLMs) and Large Reasoning Models (LRMs) have enabled agentic search systems that interleave multi-step reasoning with external tool use. However, existing frameworks largely rely on unstructured natural-language reasoning and accumulate raw intermediate traces in the context, which often leads to unstable reasoning trajectories, context overflow, and degraded performance on complex multi-hop queries. In this study, we introduce Laser, a general framework for stabilizing and scaling agentic search. Laser defines a symbolic action protocol that organizes agent behaviors into three spaces: planning, task-solving, and retrospection. Each action is specified with explicit semantics and a deterministic execution format, enabling structured and logical reasoning processes and reliable action parsing. This design makes intermediate decisions interpretable and traceable, enhancing explicit retrospection and fine-grained control over reasoning trajectories. In coordination with parsable actions, Laser further maintains a compact context register that stores only essential states of the reasoning process, allowing the agent to reason over long horizons without uncontrolled context expansion. Experiments on Qwen2.5/3-series models across challenging multi-hop QA datasets show that Laser consistently outperforms existing agentic search baselines under both prompting-only and fine-tuning settings, demonstrating that Laser provides a principled and effective foundation for robust, scalable agentic search.
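As a rough illustration of the two ideas in the abstract, a parsable action protocol and a compact context register, here is a minimal Python sketch. The action names (`PLAN`, `SEARCH`, `VERIFY`, and so on), the tag-style format, and the `ContextRegister` fields are all hypothetical; the abstract does not give Laser's actual action set or register layout, so this is an assumption-laden toy and not the authors' implementation.

```python
import re
from dataclasses import dataclass, field

# Hypothetical action names, grouped by the three spaces named in the abstract.
ACTION_SPACES = {
    "planning": {"PLAN", "DECOMPOSE"},
    "task_solving": {"SEARCH", "ANSWER"},
    "retrospection": {"VERIFY", "REVISE"},
}

# Assumed deterministic format: <NAME>argument</NAME>
ACTION_PATTERN = re.compile(
    r"^<(?P<name>[A-Z_]+)>(?P<arg>.*)</(?P=name)>$", re.DOTALL
)


@dataclass
class Action:
    space: str
    name: str
    argument: str


def parse_action(text: str) -> Action:
    """Parse one model output into an Action, or raise ValueError."""
    match = ACTION_PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"not a well-formed action: {text!r}")
    name = match.group("name")
    for space, names in ACTION_SPACES.items():
        if name in names:
            return Action(space, name, match.group("arg").strip())
    raise ValueError(f"unknown action: {name}")


@dataclass
class ContextRegister:
    """Keeps only distilled state instead of the raw interaction trace."""
    question: str
    plan: list[str] = field(default_factory=list)
    facts: list[str] = field(default_factory=list)

    def update(self, action: Action, observation: str = "") -> None:
        if action.space == "planning":
            self.plan.append(action.argument)
        elif action.name == "SEARCH" and observation:
            # Truncation stands in for whatever summarization a real system uses.
            self.facts.append(observation[:200])

    def render(self) -> str:
        """Build the compact context passed to the model on the next step."""
        lines = [f"Question: {self.question}"]
        lines += [f"Plan: {step}" for step in self.plan]
        lines += [f"Fact: {fact}" for fact in self.facts]
        return "\n".join(lines)
```

Because every output must match a single fixed pattern, malformed or unknown actions fail loudly at parse time, and the prompt for the next step is rebuilt from the register's fields, so it grows with the number of stored plan steps and facts instead of with the full trace.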

Shuting Wang, Qiaolin Xia, Vich Wang, Herberttli, Bobsimons, Zhicheng Dou • 2025

Related benchmarks

Task	Dataset	Result	(unlabeled)
Agentic Search	MuSiQue	--	14
Agentic Search	BrowseComp-ZH (test)	LJFT 21.45	12
Agentic Search	Bamboogle	LJFT Score 64.8	12
Agentic Search	Web Dancer	LJFT 49.24	12
Multi-hop Question Answering	BrowseComp-ZH	LJFT 9.69	5
Multi-hop Question Answering	MuSiQue	LJFT 23.6	5
Multi-hop Question Answering	Web Dancer	LJFT 47.72	5
Multi-hop Question Answering	Bamboogle	LJFT Score 61.6	5
Multi-hop Question Answering	Average (BrowseComp-ZH, Bamboogle, MuSiQue, Web Dancer) (Overall)	LJFT Score 35.65	5
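
The Overall figure in the Multi-hop Question Answering rows is consistent with an unweighted mean of the four per-dataset scores. The page does not state how the average was computed, so the short check below assumes a plain arithmetic mean.

```python
# Per-dataset Multi-hop Question Answering scores as listed in the table.
scores = {
    "BrowseComp-ZH": 9.69,
    "Bamboogle": 61.6,
    "MuSiQue": 23.6,
    "Web Dancer": 47.72,
}

# Assumption: the Overall row is an unweighted arithmetic mean.
overall = sum(scores.values()) / len(scores)
print(f"{overall:.2f}")
```

The four scores sum to 142.61, which divided by 4 gives 35.6525, matching the listed 35.65 after rounding to two decimals.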
