GraPPa: Grammar-Augmented Pre-Training for Table Semantic Parsing
About
We present GraPPa, an effective pre-training approach for table semantic parsing that learns a compositional inductive bias in the joint representations of textual and tabular data. We construct synthetic question-SQL pairs over high-quality tables via a synchronous context-free grammar (SCFG) induced from existing text-to-SQL datasets. We pre-train our model on the synthetic data using a novel text-schema linking objective that predicts the syntactic role of a table field in the SQL for each question-SQL pair. To maintain the model's ability to represent real-world data, we also include masked language modeling (MLM) over several existing table-and-language datasets to regularize the pre-training process. On four popular fully supervised and weakly supervised table semantic parsing benchmarks, GraPPa significantly outperforms RoBERTa-large when used as the feature representation layer, and establishes new state-of-the-art results on all of them.
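To make the SCFG-based data synthesis concrete, here is a minimal sketch of how aligned question-SQL pairs can be generated from synchronous templates. The template strings, the `synthesize` function, and the schema arguments are all illustrative assumptions, not the paper's actual grammar or implementation; the point is only that shared non-terminals (COLUMN, TABLE, VALUE) are expanded identically on both the natural-language and SQL sides.

```python
# Illustrative sketch of SCFG-style synthesis of question-SQL pairs.
# Templates and function names are hypothetical, not from the paper.
import random

# One synchronous production per entry: an NL template and a SQL template
# sharing the same non-terminals, so expansions stay aligned.
TEMPLATES = [
    ("show the {COLUMN} of all {TABLE}",
     "SELECT {COLUMN} FROM {TABLE}"),
    ("how many {TABLE} have {COLUMN} above {VALUE}",
     "SELECT COUNT(*) FROM {TABLE} WHERE {COLUMN} > {VALUE}"),
]

def synthesize(table_name, columns, values, rng=random):
    """Expand one synchronous production over a table schema,
    yielding an aligned (question, SQL) pair."""
    nl_template, sql_template = rng.choice(TEMPLATES)
    bindings = {
        "TABLE": table_name,
        "COLUMN": rng.choice(columns),
        "VALUE": str(rng.choice(values)),
    }
    # The same bindings are substituted into both templates, which is
    # what makes the grammar synchronous: the column mentioned in the
    # question is exactly the column that appears in the SQL.
    return nl_template.format(**bindings), sql_template.format(**bindings)

question, sql = synthesize("singers", ["age", "name"], [30, 40])
```

Because each generated pair records which columns were substituted where, it directly supplies labels for the text-schema linking objective: the model is asked to predict, for each table field, the syntactic role it plays in the paired SQL.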
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Text-to-SQL | Spider 1.0 (dev) | Exact Match Accuracy | 73.4 | 92 |
| Text-to-SQL | Spider 1.0 (test) | EM Acc (Overall) | 69.6 | 91 |
| Table Question Answering | WikiTableQuestions (test) | Accuracy | 52.7 | 86 |
| Table Question Answering | WikiTableQuestions (dev) | Accuracy | 51.9 | 25 |
| Table Question Answering | WikiSQL weak (test) | Denotation Accuracy | 84.7 | 20 |
| Table Question Answering | WikiSQL weak (dev) | Denotation Accuracy | 85.9 | 19 |