
Don't Generate, Discriminate: A Proposal for Grounding Language Models to Real-World Environments

About

A key missing capacity of current language models (LMs) is grounding to real-world environments. Most existing work for grounded language understanding uses LMs to directly generate plans that can be executed in the environment to achieve the desired effects. It thereby casts the burden of ensuring grammaticality, faithfulness, and controllability all on the LMs. We propose Pangu, a generic framework for grounded language understanding that capitalizes on the discriminative ability of LMs instead of their generative ability. Pangu consists of a symbolic agent and a neural LM working in a concerted fashion: The agent explores the environment to incrementally construct valid plans, and the LM evaluates the plausibility of the candidate plans to guide the search process. A case study on the challenging problem of knowledge base question answering (KBQA), which features a massive environment, demonstrates the remarkable effectiveness and flexibility of Pangu: A BERT-base LM is sufficient for setting a new record on standard KBQA datasets, and larger LMs further bring substantial gains. Pangu also enables, for the first time, effective few-shot in-context learning for KBQA with large LMs such as Codex.
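The agent/LM division of labor described above can be illustrated with a small sketch. It is not the authors' implementation: the abstract says only that the LM "guides the search process", so the beam-search loop here is an assumption. The toy knowledge graph `KG` and the helpers `execute`, `expand`, `overlap_score`, and `grounded_search` are all hypothetical. `overlap_score` is a word-overlap stand-in for the LM scorer, which in Pangu would be a model such as BERT or Codex.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical toy environment: entity -> relation -> target entities.
KG: Dict[str, Dict[str, List[str]]] = {
    "acme_corp": {"founder": ["alice"], "headquarters": ["springfield"]},
    "alice": {"born_in": ["shelbyville"], "employer": ["acme_corp"]},
    "springfield": {"mayor": ["bob"]},
}

Plan = Tuple[str, ...]  # a plan is a chain of relations from the start entity


def execute(start: str, plan: Plan) -> List[str]:
    """Run a plan against the environment and return the entities reached."""
    frontier = [start]
    for rel in plan:
        reached = [t for e in frontier for t in KG.get(e, {}).get(rel, [])]
        frontier = sorted(set(reached))
    return frontier


def expand(start: str, plan: Plan) -> List[Plan]:
    """Agent step: extend a plan only with relations that exist at its
    current frontier, so every candidate is valid by construction."""
    relations = set()
    for entity in execute(start, plan):
        relations.update(KG.get(entity, {}))
    return [plan + (rel,) for rel in sorted(relations)]


def overlap_score(question: str, plan: Plan) -> float:
    """Stand-in for the LM: reward plan tokens found in the question,
    penalize those that are not. A real scorer would be a neural LM."""
    q_tokens = set(question.lower().replace("?", "").split())
    plan_tokens = {tok for rel in plan for tok in rel.split("_")}
    hits = len(plan_tokens & q_tokens)
    return hits - 0.5 * (len(plan_tokens) - hits)


def grounded_search(
    question: str,
    start: str,
    score: Callable[[str, Plan], float],
    beam_size: int = 2,
    max_steps: int = 2,
) -> Plan:
    """Beam search (an assumed search strategy): the agent proposes valid
    plans, the scorer ranks them, and only the top few are kept."""
    beam: List[Plan] = [()]
    best_score, best_plan = float("-inf"), ()
    for _ in range(max_steps):
        candidates = [p for plan in beam for p in expand(start, plan)]
        if not candidates:
            break
        ranked = sorted(candidates, key=lambda p: score(question, p), reverse=True)
        beam = ranked[:beam_size]
        top = score(question, beam[0])
        if top > best_score:
            best_score, best_plan = top, beam[0]
    return best_plan


question = "Where was the founder of acme_corp born?"
plan = grounded_search(question, "acme_corp", overlap_score)
print(plan, execute("acme_corp", plan))
```

In this sketch the scorer never produces a plan. It only ranks candidates the agent has already checked against the environment, which is the discriminative use of the LM that the abstract contrasts with direct generation.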

Yu Gu, Xiang Deng, Yu Su • 2022

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Knowledge Base Question Answering | WEBQSP (test) | -- | 143 |
| Knowledge Base Question Answering | WebQSP Freebase (test) | F1 Score: 79.6 | 46 |
| Knowledge Base Question Answering | WebQSP → GrailQA-Tech (test) | F1 Score: 51 | 36 |
| Knowledge Base Question Answering | GrailQA v1.0 (test) | Overall EM: 75.4 | 33 |
| Knowledge Base Question Answering | GrailQA (test) | F1: 91.76 | 27 |
| Knowledge Base Question Answering | GrailQAbility answerable 1.0 (test) | F1 (L): 79.45 | 27 |
| Knowledge Graph Question Answering | GrailQA (Overall) | Hits@1: 75.4 | 20 |
| Knowledge Base Question Answering | WebQSP → GraphQA-Pop (test) | F1: 36.7 | 20 |
| Knowledge Base Question Answering | GraphQ (test) | F1: 66.7 | 19 |
| Knowledge Base Question Answering | GrailQAbility unanswerable 1.0 (test) | F1 (L): 86.48 | 19 |
Showing 10 of 24 rows
