KoCo: Conditioning Language Model Pre-training on Knowledge Coordinates
About
Standard Large Language Model (LLM) pre-training typically treats corpora as flattened token sequences, overlooking the real-world context that humans naturally rely on to interpret information. To bridge this gap, we introduce Knowledge Coordinate Conditioning (KoCo), a simple method that maps every document to a three-dimensional semantic coordinate. By prepending these coordinates as textual prefixes during pre-training, we equip the model with explicit contextual awareness, allowing it to situate each document within a real-world knowledge structure. Experimental results demonstrate that KoCo significantly improves performance across 10 downstream tasks and accelerates pre-training convergence by approximately 30%. Furthermore, our analysis indicates that explicitly modeling knowledge coordinates helps the model distinguish stable facts from noise, effectively mitigating hallucination in generated outputs.
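The conditioning step itself is lightweight: each document is paired with a 3-D coordinate and the coordinate is serialized into a textual prefix before tokenization. A minimal sketch of that prefixing step is shown below; the prefix format, tag name, and the example coordinate are illustrative assumptions, since the description above only states that coordinates are prepended as text.

```python
def koco_prefix(document: str, coord: tuple[float, float, float]) -> str:
    """Prepend a 3-D knowledge coordinate to a document as a textual prefix.

    The `<coord ...>` tag format is a hypothetical serialization choice;
    any consistent textual encoding of the three values would do.
    """
    x, y, z = coord
    return f"<coord {x:.2f} {y:.2f} {z:.2f}> {document}"

# Hypothetical coordinate, e.g. from a 3-D projection of a document embedding.
doc = "The Eiffel Tower was completed in 1889."
prefixed = koco_prefix(doc, (0.12, -0.45, 0.88))
print(prefixed)
```

At pre-training time, the prefixed string replaces the raw document in the tokenization pipeline, so the model conditions every token on the coordinate context.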
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Instruction Following | IFEval | -- | -- | 625 |
| Question Answering | ARC Easy | -- | -- | 597 |
| Physical Interaction Question Answering | PIQA | Accuracy | 74.8 | 333 |
| Common Sense Reasoning | COPA | Accuracy | 83 | 197 |
| Question Answering | ARC Challenge | Accuracy (ARC) | 44.11 | 142 |
| Question Answering | OpenBookQA | Accuracy | 51.2 | 119 |
| Social Interaction Question Answering | SIQA | Accuracy | 53.4 | 109 |
| Commonsense Question Answering | CSQA | Accuracy | 61.83 | 58 |
| Truthfulness | TruthfulQA | Truthfulness Score | 36.61 | 16 |