| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| Text-to-SQL | Spider (test) | Execution Accuracy | 91.2 | 213 |
| Text-to-SQL | Spider (dev) | EX | 90.3 | 147 |
| Text-to-SQL | Spider | Exec Acc (All) | 98.65 | 139 |
| Text-to-SQL | Spider 1.0 (test) | EM Acc (Overall) | 91.2 | 110 |
| Text-to-SQL | Spider-DK | Execution Accuracy (EX) | 81.9 | 95 |
| Text-to-SQL | Spider 1.0 (dev) | Exact Match Accuracy | 86.75 | 92 |
| Text-to-SQL | Spider-Syn | Execution Accuracy (EX) | 83.1 | 79 |
| Dataset-level accuracy estimation | Spider to SynSQL 2.5M | MAE | 2.8 | 54 |
| Dataset-level accuracy estimation | Spider to BIRD | MAE | 3.1 | 54 |
| Text-to-SQL | Spider-Realistic | Execution Accuracy (EX) | 88.6 | 47 |
| Text-to-SQL | Spider | Attack Accuracy | 41.73 | 40 |
| Text-to-SQL | Spider Realistic | Execution Accuracy (EX) | 88.6 | 39 |
| Text2SQL | Spider (test) | Exec Acc (Greedy) | 88.3 | 37 |
| Text-to-SQL | Spider Non-Synthesized Matched Set | Execution Match Accuracy (ExM) | 97.31 | 32 |
| text-to-SQL | Spider | Accuracy | 82.5 | 28 |
| Text-to-SQL | Spider 2.0 (test) | Execution Accuracy (Spider 2.0 Test) | 26.7 | 27 |
| Text-to-SQL | Spider Lite 2.0 | Execution Accuracy (EX) | 49.18 | 24 |
| SQL Semantic Validation | Spider | AUPRC | 63.21 | 24 |
| Text-to-SQL | Spider | AVGLEN | 1.07 | 22 |
| Text-to-SQL | Spider 2.0 | Llama-8B Execution Accuracy | 4.4 | 21 |
| Text-to-SQL | Spider no-easy | Accuracy | 61.4 | 20 |
| Table Retrieval | Spider Lite 2.0 (test) | Precision | 29.4 | 20 |
| Table Retrieval | Spider union (test) | Precision (P) | 69.6 | 20 |
| Schema retrieval | Spider | Recall | 100 | 19 |
| Semantic Validation | Spider 2.0 | AUPRC | 92.59 | 18 |