
Contextualized Representations Using Textual Encyclopedic Knowledge

About

We present a method to represent input texts by contextualizing them jointly with dynamically retrieved textual encyclopedic background knowledge from multiple documents. We apply our method to reading comprehension tasks by encoding questions and passages together with background sentences about the entities they mention. We show that integrating background knowledge from text is effective for tasks focusing on factual reasoning and allows direct reuse of powerful pretrained BERT-style encoders. Moreover, knowledge integration can be further improved with suitable pretraining via a self-supervised masked language model objective over words in background-augmented input text. On TriviaQA, our approach obtains improvements of 1.6 to 3.1 F1 over comparable RoBERTa models which do not integrate background knowledge dynamically. On MRQA, a large collection of diverse QA datasets, we see consistent gains in-domain along with large improvements out-of-domain on BioASQ (2.1 to 4.2 F1), TextbookQA (1.6 to 2.0 F1), and DuoRC (1.1 to 2.0 F1).
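To make the input-construction idea concrete, the following is a minimal Python sketch, not the authors' implementation, of encoding a question and passage jointly with retrieved background sentences using an off-the-shelf RoBERTa encoder from HuggingFace transformers. The `retrieve_background` helper is a hypothetical placeholder for the retrieval step, and the exact input layout (separators, number of background sentences) is an assumption for illustration.

```python
# Minimal sketch (not the paper's code): contextualize a question and passage
# jointly with retrieved encyclopedic background sentences, reusing a
# pretrained RoBERTa encoder from HuggingFace transformers.
from transformers import RobertaTokenizer, RobertaModel

tokenizer = RobertaTokenizer.from_pretrained("roberta-base")
model = RobertaModel.from_pretrained("roberta-base")


def retrieve_background(text: str) -> list[str]:
    """Hypothetical stand-in for entity linking plus retrieval of
    encyclopedic sentences about entities mentioned in `text`."""
    return ["Background sentence about a mentioned entity."]


def encode_with_background(question: str, passage: str, max_length: int = 512):
    # Append retrieved background sentences after the passage so the encoder
    # contextualizes question, passage, and background in a single input.
    background = " ".join(retrieve_background(question + " " + passage))
    inputs = tokenizer(
        question,
        passage + " " + background,
        truncation=True,
        max_length=max_length,
        return_tensors="pt",
    )
    # Contextualized token representations over the augmented input.
    return model(**inputs).last_hidden_state


hidden = encode_with_background(
    "Where was Marie Curie born?",
    "Marie Curie was a physicist and chemist who conducted pioneering "
    "research on radioactivity.",
)
print(hidden.shape)  # (1, seq_len, 768)
```

Under this reading, the pretraining step the abstract mentions would amount to running a masked language model objective over words in such background-augmented inputs before fine-tuning on QA.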

Mandar Joshi, Kenton Lee, Yi Luan, Kristina Toutanova • 2020

Related benchmarks

Task | Dataset | Metric | Result | Rank
Roll call vote prediction | Roll call vote prediction (Random) | Balanced Accuracy | 92.43 | 27
Roll call vote prediction | Roll call vote prediction (Time-Based) | Balanced Accuracy | 92.63 | 26
Misinformation Detection | SLN (test) | Micro F1 | 82.72 | 26
Political perspective detection | Allsides | Accuracy | 80.88 | 17
Misinformation Detection | LUN | Macro F1 | 56.73 | 17
Political perspective detection | SemEval | Accuracy | 81.88 | 17
Political perspective detection | Allsides (test) | Accuracy | 80.88 | 9
Misinformation Detection | LUN (test) | Micro F1 | 58.57 | 9
Political perspective detection | SemEval (test) | Accuracy | 81.88 | 9
