K-Adapter: Infusing Knowledge into Pre-Trained Models with Adapters
About
We study the problem of injecting knowledge into large pre-trained models such as BERT and RoBERTa. Existing methods typically update the original parameters of the pre-trained model when injecting knowledge. However, when multiple kinds of knowledge are injected, the previously injected knowledge may be flushed away. To address this, we propose K-Adapter, a framework that keeps the original parameters of the pre-trained model fixed and supports the development of versatile knowledge-infused models. Taking RoBERTa as the backbone model, K-Adapter has a neural adapter for each kind of infused knowledge, like a plug-in connected to RoBERTa. Since there is no information flow between different adapters, multiple adapters can be trained efficiently in a distributed way. As a case study, we inject two kinds of knowledge: (1) factual knowledge obtained from automatically aligned text-triplets on Wikipedia and Wikidata, and (2) linguistic knowledge obtained via dependency parsing. Results on three knowledge-driven tasks (relation classification, entity typing, and question answering) demonstrate that each adapter improves performance and that combining both adapters brings further improvements. Further analysis indicates that K-Adapter captures more versatile knowledge than RoBERTa.
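The core idea (frozen backbone, one independent adapter per knowledge source) can be illustrated with a minimal PyTorch sketch. This is a simplified, illustrative layout assuming the HuggingFace `transformers` library; the class names, bottleneck size, and the choice to read only the final hidden states are assumptions for brevity, not the paper's exact configuration (the paper's adapters attach to intermediate transformer layers).

```python
import torch
import torch.nn as nn
from transformers import RobertaModel


class KnowledgeAdapter(nn.Module):
    """Illustrative adapter: a small residual bottleneck fed by frozen backbone states."""

    def __init__(self, hidden_size=768, bottleneck=128):
        super().__init__()
        self.down = nn.Linear(hidden_size, bottleneck)
        self.up = nn.Linear(bottleneck, hidden_size)
        self.act = nn.GELU()

    def forward(self, hidden_states):
        # Residual transformation of the backbone's hidden states.
        return hidden_states + self.up(self.act(self.down(hidden_states)))


class KAdapterSketch(nn.Module):
    """RoBERTa backbone stays frozen; each adapter learns one kind of knowledge."""

    def __init__(self, adapter_names=("factual", "linguistic")):
        super().__init__()
        self.backbone = RobertaModel.from_pretrained("roberta-base")
        for p in self.backbone.parameters():
            p.requires_grad = False  # original pre-trained parameters are kept fixed
        self.adapters = nn.ModuleDict(
            {name: KnowledgeAdapter(self.backbone.config.hidden_size)
             for name in adapter_names}
        )

    def forward(self, input_ids, attention_mask=None):
        hidden = self.backbone(input_ids, attention_mask=attention_mask).last_hidden_state
        # No information flows between adapters: each reads only the frozen backbone output.
        adapter_outputs = [adapter(hidden) for adapter in self.adapters.values()]
        # Downstream task heads can consume the concatenated representations.
        return torch.cat([hidden] + adapter_outputs, dim=-1)
```

Because each adapter depends only on the frozen backbone and not on the other adapters, the adapters can be trained separately (even on different machines) and combined afterwards, which is what enables the distributed training mentioned above.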
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Relation Extraction | TACRED (test) | F1 Score | 72 | 194 |
| Relation Extraction | TACRED | Micro F1 | 72.04 | 97 |
| Relation Extraction | Wiki80 | Accuracy | 0.86 | 51 |
| Commonsense Question Answering | CosmosQA | Accuracy | 81.83 | 36 |
| Entity Typing | Wiki-ET | F1 Score | 77.7 | 24 |
| Fine-Grained Entity Typing | FIGER (test) | Macro F1 | 84.87 | 22 |
| Relation Extraction | TACRED v1.0 (5% train) | Micro F1 | 0.516 | 19 |
| Relation Extraction | TACRED v1.0 (full) | Micro F1 | 72 | 16 |
| Open-domain Question Answering | SearchQA | EM | 61.96 | 13 |
| Relation Extraction | TACRED v1.0 (10% train) | Micro F1 | 56 | 13 |