Learning to Edit Knowledge via Instruction-based Chain-of-Thought Prompting
About
Large language models (LLMs) can effectively handle outdated information through knowledge editing. However, current approaches face two key limitations: (I) Poor generalization: most approaches rigidly inject new knowledge without ensuring that the model can use it effectively to solve practical problems. (II) Narrow scope: current methods focus primarily on structured fact triples, overlooking the diverse unstructured forms of factual information (e.g., news, articles) prevalent in real-world contexts. To address these challenges, we propose a new paradigm: teaching LLMs to edit knowledge via Chain-of-Thought (CoT) reasoning (CoT2Edit). We first leverage language model agents to generate CoTs for both structured and unstructured edited data, building high-quality instruction data. The model is then trained to reason over edited knowledge through supervised fine-tuning (SFT) and Group Relative Policy Optimization (GRPO). At inference time, we integrate Retrieval-Augmented Generation (RAG) to dynamically retrieve relevant edited facts for real-time knowledge editing. Experimental results demonstrate that our method achieves strong generalization across six diverse knowledge editing scenarios with just a single round of training on three open-source language models. The code is available at https://github.com/FredJDean/CoT2Edit.
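The inference-time flow described above (retrieve a relevant edited fact, then prompt the model to reason over it) can be sketched roughly as follows. All names here (`retrieve_edit`, `build_cot_prompt`) are illustrative and not taken from the CoT2Edit codebase, and the token-overlap retrieval is a stand-in for the dense retriever a real RAG setup would use:

```python
def retrieve_edit(query: str, edits: list[str]) -> str:
    """Return the edited fact sharing the most tokens with the query (toy retriever)."""
    q = set(query.lower().split())
    return max(edits, key=lambda e: len(q & set(e.lower().split())))

def build_cot_prompt(query: str, fact: str) -> str:
    """Compose a prompt that asks the model to reason step by step over the edit."""
    return (
        f"Edited knowledge: {fact}\n"
        f"Question: {query}\n"
        "Let's reason step by step using the edited knowledge above."
    )

# Hypothetical edit memory holding both structured and free-form edits.
edits = [
    "The CEO of Acme Corp is Jane Doe.",
    "The capital of Atlantis is Poseidonia.",
]
query = "Who is the CEO of Acme Corp?"
prompt = build_cot_prompt(query, retrieve_edit(query, edits))
```

The resulting `prompt` would then be sent to the SFT/GRPO-trained model, which has learned to ground its reasoning chain in the retrieved edit rather than its stale parametric knowledge.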
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Knowledge Editing | CounterFact | Efficacy | 99.13 | 301 |
| Knowledge Editing | zsRE | -- | -- | 181 |
| Knowledge Editing | MQuAKE | Edit Success Rate | 99.95 | 30 |
| Knowledge Editing | Counterfact uns | Edit Success Rate | 94.56 | 30 |
| Knowledge Editing | WikiUpdate | Edit Success | 85.89 | 30 |
| Sentiment Editing | ConvSent | Success Rate (1K Edits) | 87.25 | 14 |