KILM: Knowledge Injection into Encoder-Decoder Language Models

About

Large pre-trained language models (PLMs) have been shown to retain implicit knowledge within their parameters. To enhance this implicit knowledge, we propose Knowledge Injection into Language Models (KILM), a novel approach that injects entity-related knowledge into encoder-decoder PLMs through continued pre-training with a generative knowledge infilling objective. This requires no architectural modifications to the PLMs and no additional parameters. Experimental results over a suite of knowledge-intensive tasks spanning numerous datasets show that KILM enables models to retain more knowledge and hallucinate less, while preserving their original performance on general NLU and NLG tasks. KILM also demonstrates improved zero-shot performance on tasks such as entity disambiguation, outperforming state-of-the-art models with 30x more parameters.

Yan Xu, Mahdi Namazifar, Devamanyu Hazarika, Aishwarya Padmakumar, Yang Liu, Dilek Hakkani-Tür • 2023

Related benchmarks

Task	Dataset	Result
Question Answering	TriviaQA	Accuracy 16.42	238
Question Answering	NQ	Accuracy 7.83	123
Relational Knowledge Probing	LAMA original (test)	G-RE Score 6.83	11
Knowledge-Grounded Dialogue Generation	Wizard of Wikipedia (WoW) Seen (test)	ROUGE-1 20.8	10
Knowledge-Grounded Dialogue Generation	WoW (Wizard of Wikipedia) unseen (test)	ROUGE-1 18.8	10
Entity Disambiguation	Entity Disambiguation Suite (AIDA, MSNBC, AQUAINT, ACE2004, CWEB, WIKI)	AIDA ED Score 86.2	8
Entity Disambiguation	Entity Disambiguation Suite Zero-shot (AIDA, MSNBC, AQUAINT, ACE2004, CWEB, WIKI) BLINK Candidates (test)	AIDA Score 82.1	8
Question Answering	WQ	Accuracy 12.65	8

