Can We Edit Factual Knowledge by In-Context Learning?
About
Previous studies have shown that large language models (LLMs) like GPTs store a massive amount of factual knowledge in their parameters. However, the stored knowledge can be false or outdated. Traditional knowledge editing methods refine LLMs by fine-tuning on texts containing the specific knowledge. As LLMs grow in scale, these gradient-based approaches incur high computational costs. The trend toward model-as-a-service also makes it impossible to modify knowledge in black-box LMs. Inspired by in-context learning (ICL), a new paradigm based on demonstration contexts without parameter updates, we explore whether ICL can edit factual knowledge. To answer this question, we give a comprehensive empirical study of ICL strategies. Experiments show that in-context knowledge editing (IKE), without any gradient computation or parameter updates, achieves a success rate competitive with gradient-based methods on GPT-J (6B) but with far fewer side effects, including less over-editing of similar but unrelated facts and less forgetting of previously stored knowledge. We also apply the method to larger LMs with tens or hundreds of billions of parameters, such as OPT-175B, which shows the scalability of our method. The code is available at https://github.com/Zce1112zslx/IKE.
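The in-context editing idea described above can be illustrated with a minimal sketch of prompt construction. This is an assumption-laden illustration, not the released IKE implementation: the `Demonstration` class, the `build_edit_prompt` helper, the `New Fact:` / `Prompt:` template, and the example facts (deliberately fictional or counterfactual) are all hypothetical. The point is only that the edit lives in the prompt, so no gradient step or parameter update is involved.

```python
from dataclasses import dataclass


@dataclass
class Demonstration:
    """One in-context example (hypothetical structure, for illustration only)."""
    new_fact: str  # the fact shown in context
    prompt: str    # a query the fact may or may not bear on
    answer: str    # the answer the model is expected to give


def build_edit_prompt(demos, new_fact, query):
    """Concatenate the demonstrations, then append the fact to inject and the query."""
    blocks = [
        f"New Fact: {d.new_fact}\nPrompt: {d.prompt} {d.answer}"
        for d in demos
    ]
    # The final block ends with the open query, for the model to complete.
    blocks.append(f"New Fact: {new_fact}\nPrompt: {query}")
    return "\n\n".join(blocks)


demos = [
    # The query restates the new fact, so the answer follows the edit.
    Demonstration("The capital of Atlantis is Mu.",
                  "The capital of Atlantis is", "Mu"),
    # The query is unrelated, so the answer ignores the edit.
    Demonstration("The capital of Atlantis is Mu.",
                  "The chemical symbol for gold is", "Au"),
]

prompt = build_edit_prompt(
    demos,
    new_fact="The Eiffel Tower is located in Rome.",
    query="The Eiffel Tower is located in the city of",
)
print(prompt)
```

The resulting string would be sent unchanged to a frozen or black-box LM. Mixing demonstrations where the fact applies with ones where it does not is one plausible way to discourage the over-editing of unrelated facts that the abstract mentions.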
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Knowledge Editing | CounterFact | Efficacy | 100 | 91 |
| Knowledge Editing | MMEdit E-VQA | Reliability | 100 | 61 |
| Knowledge Editing | E-VQA MMEdit 1.0 (test) | Reliability | 99.95 | 24 |
| Knowledge Editing | MMEdit E-IC 1.0 (test) | Reliability | 94.4 | 24 |
| Personalization Editing | UPQA balanced 100-sample | Explicit Accuracy | 74 | 24 |
| Knowledge Editing | MzsRE Edit: EN, Test: EN | Reliability | 1.00e+4 | 23 |
| Multimodal Knowledge Editing | MMQAKE Original Image | M-Acc | 38.93 | 18 |
| Multimodal Knowledge Editing | MMQAKE Rephrased Image | M-Acc | 37.61 | 18 |
| Knowledge Editing | MMEdit E-IC | Reliability | 96.7 | 16 |
| Knowledge Editing | Multilingual Knowledge Editing EN LLaMA backbone (test) | Reliability | 57.67 | 16 |