Unlocking Compositional Generalization in Pre-trained Models Using Intermediate Representations

About

Sequence-to-sequence (seq2seq) models are prevalent in semantic parsing, but have been found to struggle with out-of-distribution compositional generalization. Specialized model architectures and pre-training of seq2seq models have both been proposed to address this issue, but the former often comes at the cost of generality and the latter shows only limited success. In this paper, we study the impact of intermediate representations on compositional generalization in pre-trained seq2seq models, without changing the model architecture at all, and identify key aspects for designing effective representations. Instead of training a model to map natural language directly to an executable form, we map to a reversible or lossy intermediate representation that has stronger structural correspondence with natural language. The combination of our proposed intermediate representations and pre-trained models is surprisingly effective: the best combinations obtain a new state of the art on CFQ (+14.8 accuracy points) and on the template splits of three text-to-SQL datasets (+15.0 to +19.4 accuracy points). This work highlights that intermediate representations provide an important and potentially overlooked degree of freedom for improving the compositional generalization abilities of pre-trained seq2seq models.
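To make the idea of a reversible intermediate representation concrete, here is a minimal Python sketch. It is a toy illustration only, not the representation proposed in the paper: the query syntax, the grouping rule, and the function names (`to_intermediate`, `to_executable`) are all assumptions made for this example. The sketch compresses a conjunctive query by grouping objects that share a subject and relation, then expands it back to the original form.

```python
def parse_executable(query):
    """Parse a toy conjunctive query 's r o . s r o' into (s, r, o) triples."""
    triples = []
    for clause in query.split(" . "):
        subject, relation, obj = clause.split()
        triples.append((subject, relation, obj))
    return triples


def to_intermediate(query):
    """Toy reversible IR (an assumption for illustration): group objects
    that share the same subject and relation into one clause."""
    groups = {}  # dicts keep insertion order, so clause order is preserved
    for subject, relation, obj in parse_executable(query):
        groups.setdefault((subject, relation), []).append(obj)
    clauses = []
    for (subject, relation), objs in groups.items():
        joined = " ".join(objs)
        clauses.append(subject + " " + relation + " { " + joined + " }")
    return " . ".join(clauses)


def to_executable(ir):
    """Invert to_intermediate by expanding each grouped clause."""
    clauses = []
    for group in ir.split(" . "):
        tokens = group.split()
        subject, relation = tokens[0], tokens[1]
        objs = tokens[3:-1]  # the tokens between '{' and '}'
        for obj in objs:
            clauses.append(subject + " " + relation + " " + obj)
    return " . ".join(clauses)


query = "?x0 directed M1 . ?x0 directed M2 . ?x0 produced M1"
ir = to_intermediate(query)
print(ir)
print(to_executable(ir) == query)
```

In this toy setup a seq2seq model would be trained to predict the shorter grouped string, and `to_executable` would recover the executable form afterwards. The round trip is exact only when clauses sharing a subject and relation are adjacent in the original query; a lossy representation would drop information that has to be restored by a separate step.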

Jonathan Herzig, Peter Shaw, Ming-Wei Chang, Kelvin Guu, Panupong Pasupat, Yuan Zhang • 2021

Related benchmarks

Task	Dataset	Metric	Result	
Semantic Parsing	CFQ (MCD2)	Accuracy	85.3	33
Semantic Parsing	CFQ MCD3	Accuracy	77.9	33
Semantic Parsing	CFQ (MCD1)	Accuracy	88.4	33
Semantic Parsing	CFQ MCD avg	Exact Match Accuracy	83.8	22
Text-to-SQL	Geoquery	Exact Match Accuracy	83	17
Semantic Parsing	CFQ MCD1 (test)	Accuracy	88.7	15
Semantic Parsing	CFQ MCD2 (test)	Accuracy	85.3	15
Semantic Parsing	CFQ MCD3 (test)	Accuracy	77.9	15
Text-to-SQL	ATIS	Exact Match Accuracy	47.8	13
Text-to-SQL	ATIS template (test)	Accuracy	47.8	12
