KoCo: Conditioning Language Model Pre-training on Knowledge Coordinates

About

Standard Large Language Model (LLM) pre-training typically treats corpora as flattened token sequences, overlooking the real-world context that humans naturally rely on to situate information. To bridge this gap, we introduce Knowledge Coordinate Conditioning (KoCo), a simple method that maps every document to a three-dimensional semantic coordinate. By prepending these coordinates as textual prefixes during pre-training, we equip the model with explicit contextual awareness, allowing it to locate each document within a real-world knowledge structure. Experimental results demonstrate that KoCo significantly improves performance across 10 downstream tasks and accelerates pre-training convergence by approximately 30%. Furthermore, our analysis indicates that explicitly modeling knowledge coordinates helps the model distinguish stable facts from noise, effectively mitigating hallucination in generated outputs.
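
The abstract describes the mechanism only at a high level. As a rough illustration, the Python sketch below shows one way a document could be mapped to a coordinate and serialized as a textual prefix before tokenization. The `<coord ...>` tag format and the hash-based coordinate function are placeholders of ours, not the authors' implementation.

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass
class Doc:
    text: str

def knowledge_coordinates(doc: Doc) -> Tuple[float, float, float]:
    # Placeholder: stands in for however KoCo actually maps a document
    # into its 3-D semantic space (e.g., a projection of a document embedding).
    h = abs(hash(doc.text))
    return ((h % 1000) / 1000.0,
            ((h // 1000) % 1000) / 1000.0,
            ((h // 1_000_000) % 1000) / 1000.0)

def prepend_coordinates(doc: Doc) -> str:
    # Serialize the coordinate as a textual prefix and attach it to the
    # document; the combined string is what the pre-training tokenizer sees.
    x, y, z = knowledge_coordinates(doc)
    return f"<coord x={x:.3f} y={y:.3f} z={z:.3f}> {doc.text}"

if __name__ == "__main__":
    print(prepend_coordinates(Doc("The Eiffel Tower opened in 1889.")))
```

In a real pipeline the coordinates would be deterministic semantic features of the document; here Python's salted string hash makes the values vary between runs, so this sketch is for illustrating the prefix format only.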

Yudong Li, Jiawei Cai, Linlin Shen • 2026

Related benchmarks

Task                                    | Dataset       | Metric             | Result | Rank
Instruction Following                   | IFEval        | -                  | -      | 625
Question Answering                      | ARC Easy      | -                  | -      | 597
Physical Interaction Question Answering | PIQA          | Accuracy           | 74.8   | 333
Common Sense Reasoning                  | COPA          | Accuracy           | 83     | 197
Question Answering                      | ARC Challenge | Accuracy (ARC)     | 44.11  | 142
Question Answering                      | OpenBookQA    | Accuracy           | 51.2   | 119
Social Interaction Question Answering   | SIQA          | Accuracy           | 53.4   | 109
Commonsense Question Answering          | CSQA          | Accuracy           | 61.83  | 58
Truthfulness                            | TruthfulQA    | Truthfulness Score | 36.61  | 16
