Massive Editing for Large Language Models via Meta Learning

About

Although large language models (LLMs) learn knowledge from their pre-training corpora, the acquired knowledge may be incorrect from the start or become outdated over time, which makes it necessary to rectify the knowledge of the language model (LM) after training. A promising approach employs a hyper-network to generate parameter shifts, but existing hyper-networks scale poorly in the number of edits that can be applied simultaneously. To mitigate this problem, we propose the MAssive Language Model Editing Network (MALMEN), which formulates parameter shift aggregation as a least squares problem and then updates the LM parameters using the normal equation. To accommodate editing multiple facts simultaneously under a limited memory budget, we separate the computation on the hyper-network from that on the LM, enabling arbitrary batch sizes on both neural networks. Our method is evaluated by editing up to thousands of facts on LMs with different architectures, i.e., BERT-base, GPT-2, T5-XL (2.8B), and GPT-J (6B), across various knowledge-intensive NLP tasks, i.e., closed-book fact-checking and question answering. Remarkably, MALMEN is capable of editing hundreds of times more facts than strong baselines with the identical hyper-network architecture, and it outperforms an editor specifically designed for GPT. Our code is available at https://github.com/ChenmienTan/malmen.
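The least squares aggregation mentioned in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes a single linear layer, one "key" vector and one desired output change per edited fact, and a ridge term added for numerical stability. The names `aggregate_shifts`, `keys`, `value_diffs`, and `reg` are hypothetical and chosen only for illustration.

```python
import numpy as np


def aggregate_shifts(keys, value_diffs, reg=1e-3):
    """Solve min_S ||S @ keys - value_diffs||_F^2 + reg * ||S||_F^2.

    keys:        (d_in, n) array, one column per edited fact (assumed layout)
    value_diffs: (d_out, n) array, desired change in the layer output per fact
    Returns the aggregated parameter shift S with shape (d_out, d_in).
    """
    d_in = keys.shape[0]
    # Normal equation: S @ (K K^T + reg * I) = D K^T
    gram = keys @ keys.T + reg * np.eye(d_in)  # symmetric positive definite
    rhs = value_diffs @ keys.T
    # gram is symmetric, so S @ gram = rhs is equivalent to gram @ S.T = rhs.T
    return np.linalg.solve(gram, rhs.T).T


# Toy example: 5 simultaneous edits on a layer mapping 8 inputs to 4 outputs.
rng = np.random.default_rng(0)
keys = rng.normal(size=(8, 5))
value_diffs = rng.normal(size=(4, 5))
S = aggregate_shifts(keys, value_diffs)
print("largest remaining error:", np.abs(S @ keys - value_diffs).max())
```

Solving one linear system for all edits at once, rather than summing per-fact shifts, is what lets the individual edits be reconciled when their keys overlap; the ridge term here is an assumption of this sketch that keeps the system well conditioned.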

Chenmien Tan, Ge Zhang, Jie Fu • 2023

Related benchmarks

Task                              Dataset         Metric        Result   Rank
Multitask Language Understanding  MMLU (test)     Accuracy      13.39    303
Knowledge Editing                 zsRE            Generality    35.1     110
Commonsense Question Answering    CommonsenseQA   Accuracy      29.81    81
Model Editing                     RIPE            Reliability   51.5     30
Model Editing                     CounterFact     Reliability   52.4     30
Model Editing                     zsRE            Efficacy      98.75    24
Model Editing                     CounterFact     Efficacy      94.85    24
Model Editing                     zsRE            Reliability   0.664    16
Sequential Model Editing          ZSRE (test)     Reliability   99.1     14
Natural Language Understanding    GLUE            NLI Accuracy  47.48    6