Don't Generate, Discriminate: A Proposal for Grounding Language Models to Real-World Environments

About

A key missing capacity of current language models (LMs) is grounding to real-world environments. Most existing work for grounded language understanding uses LMs to directly generate plans that can be executed in the environment to achieve the desired effects. It thereby casts the burden of ensuring grammaticality, faithfulness, and controllability all on the LMs. We propose Pangu, a generic framework for grounded language understanding that capitalizes on the discriminative ability of LMs instead of their generative ability. Pangu consists of a symbolic agent and a neural LM working in a concerted fashion: The agent explores the environment to incrementally construct valid plans, and the LM evaluates the plausibility of the candidate plans to guide the search process. A case study on the challenging problem of knowledge base question answering (KBQA), which features a massive environment, demonstrates the remarkable effectiveness and flexibility of Pangu: A BERT-base LM is sufficient for setting a new record on standard KBQA datasets, and larger LMs further bring substantial gains. Pangu also enables, for the first time, effective few-shot in-context learning for KBQA with large LMs such as Codex.
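The division of labor described above, in which an agent proposes only valid plan extensions and a model ranks them, can be illustrated with a small beam search. This is a minimal sketch under stated assumptions, not the authors' implementation: the toy knowledge base, the names `KB`, `execute`, `expand`, `overlap_score`, and `search`, and the beam parameters are all hypothetical, and a simple token-overlap function stands in for the LM scorer.

```python
from typing import Callable, Dict, List, Tuple

# Toy knowledge base (hypothetical data): (entity, relation) -> reachable entities.
KB: Dict[Tuple[str, str], List[str]] = {
    ("Ohio State University", "located_in"): ["Columbus"],
    ("Ohio State University", "founded_in"): ["1870"],
    ("Columbus", "state"): ["Ohio"],
    ("Columbus", "mayor"): ["Andrew Ginther"],
}

Plan = Tuple[str, ...]  # a sequence of relations followed from the topic entity


def execute(start: str, plan: Plan) -> List[str]:
    """Follow the relations in `plan` from `start` and return the reached entities."""
    frontier = [start]
    for rel in plan:
        frontier = [t for e in frontier for t in KB.get((e, rel), [])]
    return frontier


def expand(start: str, plan: Plan) -> List[Plan]:
    """Agent step: extend `plan` only with relations that exist in the KB,
    so every candidate is executable by construction."""
    reached = execute(start, plan)
    rels = sorted({r for (e, r) in KB if e in reached})
    return [plan + (r,) for r in rels]


def overlap_score(question: str, plan: Plan) -> float:
    """Stand-in for the LM scorer (an assumption): number of plan tokens
    that also appear in the question."""
    q_tokens = set(question.lower().replace("?", "").split())
    return float(sum(tok in q_tokens for rel in plan for tok in rel.split("_")))


def search(
    question: str,
    start: str,
    score: Callable[[str, Plan], float],
    beam_size: int = 2,
    max_steps: int = 3,
) -> Plan:
    """Beam search: the agent enumerates valid extensions, the scorer ranks them."""
    beam: List[Plan] = [()]
    best: Plan = ()
    best_score = float("-inf")
    for _ in range(max_steps):
        candidates = [p for plan in beam for p in expand(start, plan)]
        if not candidates:
            break  # no plan in the beam can be extended further
        ranked = sorted(candidates, key=lambda p: score(question, p), reverse=True)
        beam = ranked[:beam_size]
        top_score = score(question, beam[0])
        if top_score > best_score:
            best, best_score = beam[0], top_score
    return best


question = "In which state is Ohio State University located?"
plan = search(question, "Ohio State University", overlap_score)
print(plan, execute("Ohio State University", plan))
```

Because `expand` only ever proposes relations that exist in the knowledge base, the scorer never has to guarantee that a plan is well-formed; it only has to discriminate among valid candidates. In a real system the scorer would be a fine-tuned or prompted LM rather than a token-overlap count.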

Yu Gu, Xiang Deng, Yu Su • 2022

Related benchmarks

Task	Dataset	Result
Knowledge Base Question Answering	WEBQSP (test)	--	145
Knowledge Base Question Answering	WebQSP Freebase (test)	--	60
Knowledge Base Question Answering	WebQSP → GrailQA-Tech (test)	F1 Score 51	36
Knowledge Base Question Answering	GrailQA v1.0 (test)	Overall EM 75.4	33
Knowledge Base Question Answering	GraphQ (test)	F1 66.7	32
Knowledge Base Question Answering	GrailQA (test)	F1 91.76	27
Knowledge Base Question Answering	GrailQAbility answerable 1.0 (test)	F1 (L) 79.45	27
Knowledge Graph Question Answering	GrailQA (Overall)	Hits@1 75.4	20
Knowledge Base Question Answering	WebQSP → GraphQA-Pop (test)	F1 36.7	20
Knowledge Base Question Answering	GrailQAbility unanswerable 1.0 (test)	F1 (L) 86.48	19
