Learning to Edit Knowledge via Instruction-based Chain-of-Thought Prompting
About
Large language models (LLMs) can effectively handle outdated information through knowledge editing. However, current approaches face two key limitations: (I) Poor generalization: most approaches rigidly inject new knowledge without ensuring that the model can use it effectively to solve practical problems. (II) Narrow scope: current methods focus primarily on structured fact triples, overlooking the diverse unstructured forms of factual information (e.g., news, articles) prevalent in real-world contexts. To address these challenges, we propose a new paradigm: teaching LLMs to edit knowledge via Chain-of-Thought (CoT) reasoning (CoT2Edit). We first leverage language model agents to generate CoTs for both structured and unstructured edited data, building high-quality instruction data. The model is then trained to reason over edited knowledge through supervised fine-tuning (SFT) and Group Relative Policy Optimization (GRPO). At inference time, we integrate Retrieval-Augmented Generation (RAG) to dynamically retrieve relevant edited facts for real-time knowledge editing. Experimental results demonstrate that our method achieves strong generalization across six diverse knowledge editing scenarios with just a single round of training on three open-source language models. The code is available at https://github.com/FredJDean/CoT2Edit.
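The inference-time flow described above (retrieve a relevant edited fact, then prompt the model to reason over it) can be sketched roughly as follows. All names here (`retrieve_edit`, `build_cot_prompt`) are illustrative and not taken from the CoT2Edit codebase, and the token-overlap retrieval is a stand-in for the dense retriever a real RAG setup would use:

```python
def retrieve_edit(query: str, edits: list[str]) -> str:
    """Return the edited fact sharing the most tokens with the query (toy retriever)."""
    q = set(query.lower().split())
    return max(edits, key=lambda e: len(q & set(e.lower().split())))

def build_cot_prompt(query: str, fact: str) -> str:
    """Compose a prompt that asks the model to reason step by step over the edit."""
    return (
        f"Edited knowledge: {fact}\n"
        f"Question: {query}\n"
        "Let's reason step by step using the edited knowledge above."
    )

# Hypothetical edit memory holding both structured and free-form edits.
edits = [
    "The CEO of Acme Corp is Jane Doe.",
    "The capital of Atlantis is Poseidonia.",
]
query = "Who is the CEO of Acme Corp?"
prompt = build_cot_prompt(query, retrieve_edit(query, edits))
```

The resulting `prompt` would then be sent to the SFT/GRPO-trained model, which has learned to ground its reasoning chain in the retrieved edit rather than its stale parametric knowledge.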
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Knowledge Editing | CounterFact | Efficacy | 99.13 | 301 |
| Knowledge Editing | zsRE | -- | -- | 181 |
| Knowledge Editing | MQuAKE | Edit Success Rate | 99.95 | 30 |
| Knowledge Editing | Counterfact uns | Edit Success Rate | 94.56 | 30 |
| Knowledge Editing | WikiUpdate | Edit Success | 85.89 | 30 |
| Sentiment Editing | ConvSent | Success Rate (1K Edits) | 87.25 | 14 |