More Than Can Be Said: A Benchmark and Framework for Pre-Question Scientific Ideation

About

AI research agents have shown strong potential in automating literature search and manuscript refinement, yet most assume a clear and actionable initial input and operate only after a research question has been made explicit. In contrast, human research often begins with tacit friction: a sense of misalignment that arises before a question can be formed. We introduce InciteResearch, a multi-agent framework designed to make a researcher's implicit understanding explicit, inspectable, and actionable. InciteResearch decomposes the logical chain of Socratic questioning and distributes it across a pipeline that (1) elicits a structured five-dimensional researcher profile state, anchored by specific friction points, from vague and even domain-unrelated inputs; (2) violates hidden assumptions by maximizing the feasibility-novelty product while enforcing a 7-stage causal derivation trace; and (3) checks whether the proposed method is a necessary consequence of the reframed insight. We further introduce TF-Bench, the first benchmark for tacit-to-explicit research assistance, which distinguishes domain-related from domain-unrelated inspirations across four scientific modes. On TF-Bench, InciteResearch improves markedly over a prompt-based baseline (novelty/impact rising from 3.671/3.806 to 4.250/4.397), shifting generated proposals from recombination toward architectural insight. Our work demonstrates that AI can serve as an extension of thinking itself, rather than merely automating downstream execution.
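
The selection rule in step (2), maximizing the feasibility-novelty product, can be illustrated with a minimal sketch. The abstract does not say how feasibility or novelty are scored, so the `Candidate` class, the `select_reframing` helper, the [0, 1] score range, and the example assumptions below are all hypothetical and are not taken from InciteResearch itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """Hypothetical reframing obtained by violating one hidden assumption."""
    assumption: str
    feasibility: float  # assumed to lie in [0, 1]; the scale is not given in the source
    novelty: float      # assumed to lie in [0, 1]; the scale is not given in the source

    @property
    def score(self) -> float:
        # The feasibility-novelty product named in the abstract.
        return self.feasibility * self.novelty


def select_reframing(candidates):
    """Return the candidate with the largest feasibility-novelty product."""
    if not candidates:
        raise ValueError("need at least one candidate")
    return max(candidates, key=lambda c: c.score)


# Illustrative, made-up candidates.
candidates = [
    Candidate("labels are required", feasibility=0.9, novelty=0.2),
    Candidate("training data is static", feasibility=0.6, novelty=0.7),
    Candidate("the model must be differentiable", feasibility=0.1, novelty=0.95),
]
best = select_reframing(candidates)
print(best.assumption)
```

One plausible reason to use a product rather than a sum is that a candidate scoring near zero on either axis is driven toward zero overall, so a highly novel but infeasible idea (or a feasible but unoriginal one) cannot win on a single dimension.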

Jie Yu, Song Qiu • 2026

Related benchmarks

Task	Dataset	Result
Research Proposal Generation	TF-Bench	Novelty: 4.328	5
Research Proposal Evaluation	TF-Bench RELATED	Novelty Score: 4.172	2
Research Proposal Evaluation	TF-Bench (OVERALL)	Novelty Score: 4.25	2
