E-SQL: Direct Schema Linking via Question Enrichment in Text-to-SQL

About

Translating Natural Language Queries into Structured Query Language (Text-to-SQL or NLQ-to-SQL) is a critical task extensively studied by both the natural language processing and database communities, aimed at providing a natural language interface to databases (NLIDB) and lowering the barrier for non-experts. Despite recent advancements made through the use of Large Language Models (LLMs), significant challenges remain. These include handling complex database schemas, resolving ambiguity in user queries, and generating SQL queries with intricate structures that accurately reflect the user's intent. In this work, we introduce E-SQL, a novel pipeline specifically designed to address these challenges through direct schema linking and candidate predicate augmentation. E-SQL enhances the natural language query by incorporating relevant database items (i.e., tables, columns, and values) and conditions directly into the question and SQL construction plan, bridging the gap between the query and the database structure. The pipeline leverages candidate predicate augmentation to mitigate erroneous or incomplete predicates in generated SQLs. Comprehensive evaluations on the BIRD benchmark illustrate that E-SQL achieves competitive performance, particularly excelling in complex queries with a 66.29% execution accuracy on the test set. A further observation from our experiments reveals that incorporating schema filtering into the translation pipeline does not have a positive impact on performance when the most advanced proprietary LLMs are used. Additionally, our experiments with small LLMs highlight the importance and positive impact of enriched questions on their performance. Without fine-tuning, single-prompt SQL generation using enriched questions with DeepSeek Coder 7B Instruct 1.5v achieves 56.45% execution accuracy on the BIRD development set.

Hasan Alp Caferoğlu, Özgür Ulusoy • 2024

Related benchmarks

Task         Dataset                                 Metric                    Result  Rank
Text-to-SQL  BIRD (dev)                              Execution Accuracy (EA)   66.29   387
Text-to-SQL  Spider (dev)                            EX                        64.65   147
Text-to-SQL  Spider                                  Exec Acc (All)            75.63   139
Text-to-SQL  Bird                                    Execution Accuracy (EX)   59.1    63
Text-to-SQL  BIRD-SQL Mini (dev)                     Execution Accuracy (EX)   57.4    21
Text-to-SQL  Bird                                    Execution Accuracy        59.1    20
Text-to-SQL  Mini (dev)                              Execution Accuracy (EX)   57.4    9
Text-to-SQL  Spider lite 2.0 (Random 100 examples)   Execution Accuracy (EX)   21      4
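
The results above are all reported as execution accuracy, so a small sketch of how such a score can be computed may help when reading them. The version below is a simplified illustration, not the official BIRD or Spider evaluator: it assumes a prediction counts as correct when it returns the same set of rows as the reference query, and it treats a prediction that fails to execute as a miss. The `singer` table, the demo queries, and the helper names are hypothetical.

```python
import sqlite3


def execution_match(conn, predicted_sql, gold_sql):
    """Return True if both queries yield the same set of rows.

    Assumption: rows are compared as sets, so row order and duplicate
    rows are ignored. Real evaluators may be stricter.
    """
    try:
        predicted_rows = conn.execute(predicted_sql).fetchall()
    except sqlite3.Error:
        return False  # a prediction that fails to execute counts as a miss
    gold_rows = conn.execute(gold_sql).fetchall()
    return set(predicted_rows) == set(gold_rows)


def execution_accuracy(conn, pairs):
    """Percentage of (predicted, gold) query pairs whose results match."""
    if not pairs:
        return 0.0
    hits = sum(execution_match(conn, pred, gold) for pred, gold in pairs)
    return 100.0 * hits / len(pairs)


def build_demo_db():
    """Create a tiny in-memory database (hypothetical schema and data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE singer (id INTEGER PRIMARY KEY, name TEXT, age INTEGER)"
    )
    conn.executemany(
        "INSERT INTO singer VALUES (?, ?, ?)",
        [(1, "Ann", 30), (2, "Bob", 45), (3, "Cy", 52)],
    )
    return conn


# (predicted, gold) pairs: two matches, one wrong result, one invalid query.
DEMO_PAIRS = [
    ("SELECT name FROM singer WHERE age >= 45",
     "SELECT name FROM singer WHERE age > 40"),
    ("SELECT name FROM singer WHERE age > 50",
     "SELECT name FROM singer WHERE age > 40"),
    ("SELECT nam FROM singer",
     "SELECT name FROM singer"),
    ("SELECT COUNT(*) FROM singer",
     "SELECT COUNT(id) FROM singer"),
]

if __name__ == "__main__":
    demo_conn = build_demo_db()
    print(execution_accuracy(demo_conn, DEMO_PAIRS))
```

Comparing result sets rather than SQL strings is what lets two differently written queries (such as `age >= 45` and `age > 40` on this data) both count as correct; the trade-off is that a query can match by coincidence on a small database.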
