GreaseLM: Graph REASoning Enhanced Language Models for Question Answering

About

Answering complex questions about textual narratives requires reasoning over both stated context and the world knowledge that underlies it. However, pretrained language models (LM), the foundation of most modern QA systems, do not robustly represent latent relationships between concepts, which is necessary for reasoning. While knowledge graphs (KG) are often used to augment LMs with structured representations of world knowledge, it remains an open question how to effectively fuse and reason over the KG representations and the language context, which provides situational constraints and nuances. In this work, we propose GreaseLM, a new model that fuses encoded representations from pretrained LMs and graph neural networks over multiple layers of modality interaction operations. Information from both modalities propagates to the other, allowing language context representations to be grounded by structured world knowledge, and allowing linguistic nuances (e.g., negation, hedging) in the context to inform the graph representations of knowledge. Our results on three benchmarks in the commonsense reasoning (i.e., CommonsenseQA, OpenbookQA) and medical question answering (i.e., MedQA-USMLE) domains demonstrate that GreaseLM can more reliably answer questions that require reasoning over both situational constraints and structured knowledge, even outperforming models 8x larger.

Xikun Zhang, Antoine Bosselut, Michihiro Yasunaga, Hongyu Ren, Percy Liang, Christopher D. Manning, Jure Leskovec • 2022
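
The abstract describes information passing between the language and graph modalities across multiple interaction layers. The following is a minimal sketch of one such exchange step, not the authors' implementation. It assumes the exchange goes through one designated token vector and one designated node vector (index 0 in each list), which are concatenated, mixed by a small two-layer MLP, and split back. The names `init_linear`, `linear`, and `fuse`, the ReLU activation, and all dimensions are hypothetical choices made for illustration.

```python
import math
import random


def init_linear(n_in, n_out, rng):
    """Create (weights, bias) for a dense layer with small random weights."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    bias = [0.0] * n_out
    return weights, bias


def linear(x, layer):
    """Apply y = W x + b to a single vector x."""
    weights, bias = layer
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def fuse(token_states, node_states, mlp):
    """One hypothetical modality-interaction step.

    token_states[0] is the assumed interaction token and node_states[0]
    the assumed interaction node. Only these two vectors are mixed; all
    other token and node vectors pass through unchanged.
    """
    d_text = len(token_states[0])
    first, second = mlp
    joint = token_states[0] + node_states[0]               # concatenate
    hidden = [max(0.0, h) for h in linear(joint, first)]   # ReLU
    mixed = linear(hidden, second)
    new_tokens = [mixed[:d_text]] + token_states[1:]        # split back
    new_nodes = [mixed[d_text:]] + node_states[1:]
    return new_tokens, new_nodes


# Usage with made-up sizes: 5 tokens of width 4, 6 nodes of width 3.
rng = random.Random(0)
d_text, d_node = 4, 3
mlp = (init_linear(d_text + d_node, 8, rng),
       init_linear(8, d_text + d_node, rng))
tokens = [[rng.random() for _ in range(d_text)] for _ in range(5)]
nodes = [[rng.random() for _ in range(d_node)] for _ in range(6)]
tokens, nodes = fuse(tokens, nodes, mlp)
```

In a full model, each such step would sit between a language-model layer that updates all token vectors and a graph-neural-network layer that updates all node vectors, so the mixed information can spread within each modality. Those layers are omitted here.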

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Commonsense Reasoning | HellaSwag | Accuracy 82.8 | 1460 |
| Commonsense Reasoning | PIQA | Accuracy 79.6 | 647 |
| Commonsense Reasoning | CSQA | Accuracy 74.2 | 366 |
| Commonsense Reasoning | ARC Challenge | Accuracy 44.7 | 132 |
| Question Answering | OpenBookQA (OBQA) (test) | OBQA Accuracy 84.8 | 130 |
| Question Answering | MedQA-USMLE (test) | Accuracy 45.1 | 101 |
| Question Answering | PubMedQA (test) | Accuracy 72.4 | 81 |
| Commonsense Reasoning | OBQA | Accuracy 66.9 | 75 |
| Question Answering | MedQA (test) | Accuracy 38.5 | 61 |
| Question Answering | CommonsenseQA IH (test) | Accuracy 74.2 | 57 |
Showing 10 of 20 rows
