
KALM: Knowledge-Aware Integration of Local, Document, and Global Contexts for Long Document Understanding

About

With the advent of pretrained language models (LMs), increasing research efforts have been focusing on infusing commonsense and domain-specific knowledge to prepare LMs for downstream tasks. These works attempt to leverage knowledge graphs, the de facto standard of symbolic knowledge representation, along with pretrained LMs. While existing approaches have leveraged external knowledge, it remains an open question how to jointly incorporate knowledge graphs representing varying contexts, from local (e.g., sentence), to document-level, to global knowledge, to enable knowledge-rich exchange across these contexts. Such rich contextualization can be especially beneficial for long document understanding tasks since standard pretrained LMs are typically bounded by the input sequence length. In light of these challenges, we propose KALM, a Knowledge-Aware Language Model that jointly leverages knowledge in local, document-level, and global contexts for long document understanding. KALM first encodes long documents and knowledge graphs into the three knowledge-aware context representations. It then processes each context with context-specific layers, followed by a context fusion layer that facilitates knowledge exchange to derive an overarching document representation. Extensive experiments demonstrate that KALM achieves state-of-the-art performance on six long document understanding tasks and datasets. Further analyses reveal that the three knowledge-aware contexts are complementary and they all contribute to model performance, while the importance and information exchange patterns of different contexts vary with respect to different tasks and datasets.
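The context fusion step described in the abstract can be illustrated with a small sketch. This is not the paper's actual layer: it assumes each of the three contexts has already been reduced to a single fixed-size vector, and it swaps in plain scaled dot-product attention followed by averaging as a stand-in for the fusion mechanism. The function name `fuse_contexts` and the toy vectors are hypothetical.

```python
import math


def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def fuse_contexts(contexts):
    """Illustrative fusion (an assumption, not KALM's implementation).

    contexts: dict mapping a context name to a vector (list of floats,
    all of the same dimension). Each context attends over all contexts
    with scaled dot-product attention; the exchanged vectors are then
    averaged into one document representation.
    """
    names = list(contexts)
    dim = len(contexts[names[0]])
    scale = math.sqrt(dim)
    exchanged, weights = {}, {}
    for q in names:
        scores = [dot(contexts[q], contexts[k]) / scale for k in names]
        attn = softmax(scores)
        weights[q] = dict(zip(names, attn))
        # Weighted sum of all context vectors, seen from context q.
        exchanged[q] = [
            sum(a * contexts[k][i] for a, k in zip(attn, names))
            for i in range(dim)
        ]
    # Average the exchanged vectors into a single document vector.
    doc = [sum(exchanged[n][i] for n in names) / len(names) for i in range(dim)]
    return doc, weights


# Toy inputs: one made-up vector per context.
contexts = {
    "local": [1.0, 0.0, 0.0, 0.0],
    "document": [0.0, 1.0, 0.0, 0.0],
    "global": [0.0, 0.0, 1.0, 0.0],
}
doc_vec, attn = fuse_contexts(contexts)
```

In this sketch the attention weights in `attn` play the role of the information exchange patterns that the abstract says vary across tasks and datasets; a trained model would learn projections rather than use the raw vectors directly.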

Shangbin Feng, Zhaoxuan Tan, Wenqian Zhang, Zhenyu Lei, Yulia Tsvetkov • 2022

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Roll call vote prediction | Roll call vote prediction (Random) | BAcc | 92.36 | 27 |
| Misinformation Detection | SLN (test) | Micro F1 | 94.22 | 26 |
| Roll call vote prediction | Roll call vote prediction (Time-Based) | Balanced Accuracy | 94.46 | 26 |
| Misinformation Detection | LUN | Macro F1 | 69.82 | 17 |
| political perspective detection | SemEval | Accuracy | 91.45 | 17 |
| political perspective detection | Allsides | Accuracy | 87.26 | 17 |
| Misinformation Detection | LUN (test) | Micro F1 | 71.28 | 9 |
| political perspective detection | SemEval (test) | Accuracy | 0.9145 | 9 |
| political perspective detection | Allsides (test) | Accuracy | 87.26 | 9 |
