DTS-SQL: Decomposed Text-to-SQL with Small Large Language Models
About
Leading models for the text-to-SQL task heavily rely on proprietary Large Language Models (LLMs), posing concerns over data privacy. Closing the performance gap between small open-source models and large proprietary models is crucial to mitigate this reliance. To this end, we introduce a novel two-stage fine-tuning approach that decomposes the task into two simpler tasks. Through comprehensive evaluation on two large cross-domain datasets and two small LLMs, we show that this approach improves execution accuracy by 3 to 7 percent, effectively aligning the performance of open-source models with their proprietary counterparts.
Mohammadreza Pourreza, Davood Rafiei • 2024
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Text-to-SQL | BIRD (dev) | Execution Accuracy (EA) | 61.56 | 217 |
| Text-to-SQL | Spider (test) | Execution Accuracy | 82.8 | 140 |
| Text-to-SQL | Spider (dev) | EX (All) | 82.7 | 100 |
| Text-to-SQL | Spider 1.0 (test) | EM Acc (Overall) | 77 | 91 |
| Text-to-SQL | LogicCat | Exact Match | 14.88 | 58 |
| Text-to-SQL | Spider | Exec Acc (All) | 85.09 | 57 |
| Text-to-SQL | Archer (dev) | Execution Accuracy | 33.17 | 36 |
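
Most rows above report execution accuracy: a predicted query counts as correct when running it returns the same result as the gold query. The sketch below illustrates that idea with Python's built-in `sqlite3` module. It is a simplified illustration, not the official Spider or BIRD evaluator, whose comparison rules differ in details such as row and column ordering. The `singer` table, the query pairs, and the helper names `execution_match` and `execution_accuracy` are hypothetical and chosen only for this example.

```python
import sqlite3
from collections import Counter


def execution_match(conn, predicted_sql, gold_sql):
    """Return True if both queries run and yield the same multiset of rows."""
    try:
        predicted_rows = conn.execute(predicted_sql).fetchall()
    except sqlite3.Error:
        # A prediction that fails to execute counts as incorrect.
        return False
    gold_rows = conn.execute(gold_sql).fetchall()
    # Assumption: row order is ignored, duplicates are respected.
    return Counter(predicted_rows) == Counter(gold_rows)


def execution_accuracy(conn, pairs):
    """Percentage of (predicted, gold) pairs whose execution results match."""
    matches = sum(execution_match(conn, pred, gold) for pred, gold in pairs)
    return 100.0 * matches / len(pairs)


# Hypothetical toy database, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE singer (id INTEGER PRIMARY KEY, name TEXT, age INTEGER);
    INSERT INTO singer VALUES (1, 'Ann', 30), (2, 'Bo', 41), (3, 'Cy', 25);
    """
)

# Each pair is (predicted SQL, gold SQL).
pairs = [
    # Different text, same rows: counted as correct.
    ("SELECT name FROM singer WHERE age > 28",
     "SELECT name FROM singer WHERE age >= 29"),
    # Both run, but the counts differ: incorrect.
    ("SELECT count(*) FROM singer",
     "SELECT count(*) FROM singer WHERE age > 28"),
    # Misspelled column, so the prediction fails to execute: incorrect.
    ("SELECT nam FROM singer",
     "SELECT name FROM singer"),
    # Same rows in a different order: counted as correct here.
    ("SELECT name, age FROM singer ORDER BY age",
     "SELECT name, age FROM singer"),
]

score = execution_accuracy(conn, pairs)
print(f"Execution accuracy: {score:.1f}%")
```

Exact-match metrics, by contrast, compare the SQL itself rather than its results, so the first pair would likely be scored as wrong under exact match even though it is correct under execution accuracy.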