Structured Knowledge Representation through Contextual Pages for Retrieval-Augmented Generation

About

Retrieval-Augmented Generation (RAG) enhances Large Language Models (LLMs) by incorporating external knowledge. Recently, some works have incorporated iterative knowledge accumulation processes into RAG models to progressively accumulate and refine query-related knowledge, thereby constructing more comprehensive knowledge representations. However, these iterative processes often lack a coherent organizational structure, which limits how comprehensive and cohesive the resulting knowledge representations can be. To address this, we propose PAGER, a page-driven autonomous knowledge representation framework for RAG. PAGER first prompts an LLM to construct a structured cognitive outline for a given question, which consists of multiple slots, each representing a distinct knowledge aspect. Then, PAGER iteratively retrieves and refines relevant documents to populate each slot, ultimately constructing a coherent page that serves as contextual input for guiding answer generation. Experiments on multiple knowledge-intensive benchmarks and backbone models show that PAGER consistently outperforms all RAG baselines. Further analyses demonstrate that PAGER constructs higher-quality, more information-dense knowledge representations, better mitigates knowledge conflicts, and enables LLMs to leverage external knowledge more effectively. All code is available at https://github.com/OpenBMB/PAGER.

Xinze Li, Zhenghao Liu, Haidong Xin, Yukun Yan, Shuo Wang, Zheni Zeng, Sen Mei, Ge Yu, Maosong Sun • 2026
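
The outline-then-fill pipeline described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (that lives in the linked repository); every name here (`Page`, `build_page`, `toy_outline`, `toy_retrieve`, `toy_refine`, `CORPUS`) is hypothetical, and the LLM and retriever are replaced by toy stand-ins so the example runs on its own.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Page:
    """A question plus one text slot per knowledge aspect (hypothetical structure)."""
    question: str
    slots: Dict[str, str] = field(default_factory=dict)

    def render(self) -> str:
        # Flatten the page into a single context string for answer generation.
        lines = [f"Question: {self.question}"]
        for aspect, content in self.slots.items():
            lines.append(f"[{aspect}] {content or '(empty)'}")
        return "\n".join(lines)


def build_page(
    question: str,
    make_outline: Callable[[str], List[str]],
    retrieve: Callable[[str, int], List[str]],
    refine: Callable[[str, str, List[str]], str],
    rounds: int = 2,
    top_k: int = 3,
) -> Page:
    # Step 1: build an outline whose entries become empty slots.
    page = Page(question, {aspect: "" for aspect in make_outline(question)})
    # Step 2: repeatedly retrieve documents per slot and refine the slot content.
    for _ in range(rounds):
        for aspect in list(page.slots):
            docs = retrieve(f"{question} {aspect}", top_k)
            page.slots[aspect] = refine(aspect, page.slots[aspect], docs)
    return page


# Toy stand-ins (assumptions): a fixed outline, keyword-overlap retrieval,
# and a "refiner" that keeps documents mentioning the aspect, without duplicates.
CORPUS = [
    "The film's director is Jane Doe.",
    "Jane Doe has her birthplace in Springfield.",
    "Unrelated note about weather.",
]


def toy_outline(question: str) -> List[str]:
    return ["director", "birthplace"]


def toy_retrieve(query: str, top_k: int) -> List[str]:
    terms = set(query.lower().split())
    scored = [(len(terms & set(d.lower().replace(".", "").split())), d) for d in CORPUS]
    ranked = sorted(scored, key=lambda pair: -pair[0])
    return [doc for score, doc in ranked if score > 0][:top_k]


def toy_refine(aspect: str, current: str, docs: List[str]) -> str:
    new = [d for d in docs if aspect in d.lower() and d not in current]
    return " ".join(([current] if current else []) + new)


page = build_page(
    "Where was the director of the film born?",
    toy_outline, toy_retrieve, toy_refine,
)
print(page.render())
```

In a real system the three callables would wrap an LLM prompt for the outline, a document retriever, and an LLM-based refinement step; the sketch only shows how a slot-structured page keeps the accumulated knowledge organized by aspect.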

Related benchmarks

Task	Dataset	Result
Multi-hop Question Answering	HotpotQA (test)	F1 20.31	311
Multi-hop Question Answering	2WikiMultiHopQA (test)	EM 11.8	226
Question Answering	2WikiMQA	--	66
Question Answering	HotpotQA	Cover EM 52.4	18
Question Answering	MuSiQue	Cover EM 24.3	18
Question Answering	Bamboogle	Cover Exact Match 62.4	18
Question Answering	AmbigQA	Cover EM 60	18
Question Answering	NQ	Cover EM 0.565	18
Question Answering	HotpotQA (sampled)	Accuracy 54	4
Question Answering	MuSiQue (sampled)	Accuracy 31.5	4

Showing 10 of 14 rows
