GraPPa: Grammar-Augmented Pre-Training for Table Semantic Parsing
About
We present GraPPa, an effective pre-training approach for table semantic parsing that learns a compositional inductive bias in the joint representations of textual and tabular data. We construct synthetic question-SQL pairs over high-quality tables via a synchronous context-free grammar (SCFG) induced from existing text-to-SQL datasets. We pre-train our model on the synthetic data using a novel text-schema linking objective that predicts the syntactic role of a table field in the SQL for each question-SQL pair. To maintain the model's ability to represent real-world data, we also include masked language modeling (MLM) over several existing table-and-language datasets to regularize the pre-training process. On four popular fully supervised and weakly supervised table semantic parsing benchmarks, GraPPa significantly outperforms RoBERTa-large when used as the feature representation layer, and establishes new state-of-the-art results on all of them.
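To make the SCFG-based data synthesis concrete, here is a minimal sketch of how aligned question-SQL pairs can be generated from synchronous templates. The template strings, the `synthesize` function, and the schema arguments are all illustrative assumptions, not the paper's actual grammar or implementation; the point is only that shared non-terminals (COLUMN, TABLE, VALUE) are expanded identically on both the natural-language and SQL sides.

```python
# Illustrative sketch of SCFG-style synthesis of question-SQL pairs.
# Templates and function names are hypothetical, not from the paper.
import random

# One synchronous production per entry: an NL template and a SQL template
# sharing the same non-terminals, so expansions stay aligned.
TEMPLATES = [
    ("show the {COLUMN} of all {TABLE}",
     "SELECT {COLUMN} FROM {TABLE}"),
    ("how many {TABLE} have {COLUMN} above {VALUE}",
     "SELECT COUNT(*) FROM {TABLE} WHERE {COLUMN} > {VALUE}"),
]

def synthesize(table_name, columns, values, rng=random):
    """Expand one synchronous production over a table schema,
    yielding an aligned (question, SQL) pair."""
    nl_template, sql_template = rng.choice(TEMPLATES)
    bindings = {
        "TABLE": table_name,
        "COLUMN": rng.choice(columns),
        "VALUE": str(rng.choice(values)),
    }
    # The same bindings are substituted into both templates, which is
    # what makes the grammar synchronous: the column mentioned in the
    # question is exactly the column that appears in the SQL.
    return nl_template.format(**bindings), sql_template.format(**bindings)

question, sql = synthesize("singers", ["age", "name"], [30, 40])
```

Because each generated pair records which columns were substituted where, it directly supplies labels for the text-schema linking objective: the model is asked to predict, for each table field, the syntactic role it plays in the paired SQL.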
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Text-to-SQL | Spider 1.0 (dev) | Exact Match Accuracy | 73.4 | 92 |
| Text-to-SQL | Spider 1.0 (test) | EM Acc (Overall) | 69.6 | 91 |
| Table Question Answering | WikiTableQuestions (test) | Accuracy | 52.7 | 86 |
| Table Question Answering | WikiTableQuestions (dev) | Accuracy | 51.9 | 25 |
| Table Question Answering | WikiSQL weak (test) | Denotation Accuracy | 84.7 | 20 |
| Table Question Answering | WikiSQL weak (dev) | Denotation Accuracy | 85.9 | 19 |