
RhinoInsight: Improving Deep Research through Control Mechanisms for Model Behavior and Context

About

Large language models are evolving from single-turn responders into tool-using agents capable of sustained reasoning and decision-making for deep research. Prevailing systems adopt a linear pipeline (plan, then search, then write the report), which suffers from error accumulation and context rot because it lacks explicit control over both model behavior and context. We introduce RhinoInsight, a deep research framework that adds two control mechanisms to enhance robustness, traceability, and overall quality without parameter updates. First, a Verifiable Checklist module transforms user requirements into traceable and verifiable sub-goals, incorporates human or LLM critics for refinement, and compiles a hierarchical outline to anchor subsequent actions and prevent non-executable planning. Second, an Evidence Audit module structures search content, iteratively updates the outline, and prunes noisy context, while a critic ranks and binds high-quality evidence to drafted content to ensure verifiability and reduce hallucinations. Our experiments demonstrate that RhinoInsight achieves state-of-the-art performance on deep research tasks while remaining competitive on deep search tasks.
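The two control mechanisms can be illustrated with a minimal Python sketch. This is not the authors' implementation: the names `Evidence`, `ChecklistItem`, `audit`, and `unverified`, the score scale, and the `min_score` and `top_k` parameters are all assumptions made for illustration. It only shows the general idea of binding ranked evidence to checklist sub-goals and pruning low-quality context.

```python
from dataclasses import dataclass, field


@dataclass
class Evidence:
    source: str   # hypothetical identifier, e.g. a document id
    text: str     # the retrieved passage
    score: float  # critic-assigned quality score (assumed to lie in [0, 1])


@dataclass
class ChecklistItem:
    goal: str  # one verifiable sub-goal derived from the user request
    evidence: list = field(default_factory=list)

    @property
    def verified(self) -> bool:
        # Assumption: a sub-goal counts as verified once any evidence is bound.
        return bool(self.evidence)


def audit(items, candidates, min_score=0.5, top_k=2):
    """Prune low-scoring evidence, then bind the top_k best pieces to each item.

    `candidates` maps a sub-goal string to the evidence retrieved for it.
    """
    for item in items:
        kept = [e for e in candidates.get(item.goal, []) if e.score >= min_score]
        kept.sort(key=lambda e: e.score, reverse=True)
        item.evidence = kept[:top_k]
    return items


def unverified(items):
    """Return the sub-goals that still have no bound evidence."""
    return [item.goal for item in items if not item.verified]


if __name__ == "__main__":
    items = [ChecklistItem("market size"), ChecklistItem("key competitors")]
    candidates = {
        "market size": [
            Evidence("doc-a", "passage A", 0.9),
            Evidence("doc-b", "passage B", 0.2),  # pruned: below min_score
        ],
    }
    audit(items, candidates)
    print(unverified(items))
```

In this sketch, sub-goals left in the `unverified` list would be the ones a system might search for again or flag in the outline; how RhinoInsight actually handles them is not specified on this page.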

Yu Lei, Shuzheng Si, Wei Wang, Yifei Wu, Gang Chen, Fanchao Qi, Maosong Sun • 2025

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Deep Research Report Generation | DeepResearch Bench | Comprehensiveness: 50.51 | 54 |
| Comparative Performance Evaluation | DeepConsult | Win Rate: 0.6851 | 24 |
| Open-ended deep research evaluation | DeepResearch Bench 100 PhD-level research tasks | Comprehensiveness: 50.51 | 9 |
| Deep Research | DeepConsult (test) | Win Rate: 68.51 | 8 |
