CHESS: Contextual Harnessing for Efficient SQL Synthesis

About

Translating natural language questions into SQL queries, known as text-to-SQL, is a long-standing research problem. Effective text-to-SQL synthesis is challenging because of (i) the extensive size of database catalogs (descriptions of tables and their columns) and database values, (ii) the need to reason over large database schemas, (iii) the need to ensure the functional validity of the generated queries, and (iv) the ambiguities of natural language questions. We introduce CHESS, a Large Language Model (LLM) based multi-agent framework for efficient and scalable SQL synthesis. It comprises four specialized agents, each targeting one of these challenges: the Information Retriever (IR) extracts relevant data, the Schema Selector (SS) prunes large schemas, the Candidate Generator (CG) generates high-quality candidates and refines queries iteratively, and the Unit Tester (UT) validates queries through LLM-based natural language unit tests. The framework offers configurable features that adapt to various deployment constraints: (1) Support for industrial-scale databases: using the Schema Selector agent, CHESS efficiently narrows very large database schemas down to manageable sub-schemas, boosting system accuracy by approximately 2% and reducing the number of LLM tokens by a factor of 5. (2) State-of-the-art privacy-preserving performance: among methods using open-source models, CHESS achieves state-of-the-art performance, yielding a high-performing, privacy-preserving system suitable for industrial deployment. (3) Scalability with additional compute budget: in settings with high computational budgets, CHESS achieves 71.10% accuracy on the BIRD test set, within 2% of the leading proprietary method, while requiring approximately 83% fewer LLM calls.
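The division of labor among the four agents can be illustrated with a minimal sketch. This is not the CHESS implementation: every name here (`Context`, `information_retriever`, `schema_selector`, `candidate_generator`, `unit_tester`, `run_pipeline`, `stub_llm`) is hypothetical, the prompts are placeholders, the keyword matching is a naive substring check, and the iterative refinement step of the Candidate Generator is omitted. The `stub_llm` function is a deterministic stand-in so the example runs without a model.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# Hypothetical type: any function mapping a prompt string to a completion string.
LLM = Callable[[str], str]


@dataclass
class Context:
    question: str
    schema: Dict[str, List[str]]  # table name -> column names
    keywords: List[str] = field(default_factory=list)
    candidates: List[str] = field(default_factory=list)


def information_retriever(ctx: Context, llm: LLM) -> Context:
    """IR: extract question keywords to match against the catalog."""
    reply = llm("KEYWORDS\n" + ctx.question)
    ctx.keywords = [k.strip().lower() for k in reply.split(",") if k.strip()]
    return ctx


def schema_selector(ctx: Context) -> Context:
    """SS: keep only tables whose name or columns mention a keyword."""
    def relevant(table: str, columns: List[str]) -> bool:
        names = [table.lower()] + [c.lower() for c in columns]
        return any(k in n for k in ctx.keywords for n in names)

    ctx.schema = {t: cols for t, cols in ctx.schema.items() if relevant(t, cols)}
    return ctx


def candidate_generator(ctx: Context, llm: LLM, n: int = 2) -> Context:
    """CG: sample n candidate queries over the pruned schema (no refinement)."""
    prompt = "SQL\n" + ctx.question + "\n" + repr(ctx.schema)
    ctx.candidates = [llm(prompt + "\n#" + str(i)) for i in range(n)]
    return ctx


def unit_tester(ctx: Context, llm: LLM, tests: List[str]) -> str:
    """UT: return the candidate that passes the most natural-language tests."""
    def score(sql: str) -> int:
        return sum(llm("TEST\n" + t + "\n" + sql).strip() == "pass" for t in tests)

    return max(ctx.candidates, key=score)


def run_pipeline(question: str, schema: Dict[str, List[str]],
                 llm: LLM, tests: List[str]) -> Tuple[Context, str]:
    ctx = Context(question, dict(schema))
    ctx = information_retriever(ctx, llm)
    ctx = schema_selector(ctx)
    ctx = candidate_generator(ctx, llm)
    return ctx, unit_tester(ctx, llm, tests)


def stub_llm(prompt: str) -> str:
    """Deterministic stand-in for a real model, for demonstration only."""
    if prompt.startswith("KEYWORDS"):
        return "customer, city"
    if prompt.startswith("SQL"):
        if prompt.endswith("#0"):
            return "SELECT name FROM customer"
        return "SELECT name FROM customer WHERE city = 'Paris'"
    # TEST prompts: this stub passes only queries that filter rows.
    return "pass" if "WHERE" in prompt else "fail"


if __name__ == "__main__":
    demo_schema = {
        "customer": ["id", "name", "city"],
        "orders": ["id", "total"],
        "audit_log": ["ts", "event"],
    }
    ctx, best = run_pipeline(
        "Which customers live in Paris?",
        demo_schema,
        stub_llm,
        ["The query must filter by city."],
    )
    print(sorted(ctx.schema), best)
```

In this sketch the schema is pruned before any query is generated, so the generation prompt carries only the tables judged relevant. That ordering is the mechanism the abstract credits for the reduction in LLM tokens.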

Shayan Talaei, Mohammadreza Pourreza, Yu-Chen Chang, Azalia Mirhoseini, Amin Saberi • 2024

Related benchmarks

Task	Dataset	Metric	Result
Text-to-SQL	BIRD (dev)	Execution Accuracy (EA)	68.31	387
Text-to-SQL	Spider (test)	Execution Accuracy	89.6	213
Text-to-SQL	Spider (dev)	EX	79.2	147
Text-to-SQL	Spider	Exec Acc (All)	83.3	139
Text-to-SQL	Spider 1.0 (test)	EM Acc (Overall)	87.2	110
Text-to-SQL	Bird	Total Execution Accuracy	62.14	68
Text-to-SQL	LogicCat	Exact Match	19.21	58
Text-to-SQL	BIRD (test)	EX	71.1	46
Text-to-SQL	Archer (dev)	Execution Accuracy	35.54	36
Knowledge-intensive Claim Extraction and Information Retrieval	75-case (test)	Overall Score	0.462	35
