EviLink: Multi-Path Schema Linking with Uncertainty-Guided Evidence Acquisition for Large-Scale Text-to-SQL
About
Schema linking is a difficult and important step in large-scale Text-to-SQL, where systems must identify a compact yet sufficient schema context from large and ambiguous databases. Existing methods often treat schema linking as deterministic selection around a single SQL path, but complex questions may admit multiple valid realizations with different schema needs. We reframe schema linking as uncertainty-aware schema-need inference over multiple plausible SQL paths, where the system distinguishes required schema items from path-dependent uncertain ones and acquires evidence only where needed. We instantiate this reframing with EviLink, which combines multi-hypothesis schema grounding with uncertainty-guided evidence acquisition. Experiments on BIRD-Dev and Spider2-Snow show that this perspective improves the balance among schema completeness, schema relevance, and token cost. On Spider2-Snow, EviLink achieves a field-level strict recall rate (SRR) of 90.15%, uses 123.30K tokens on average, and improves downstream SQL generation under a fixed generator.
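The distinction between required and path-dependent schema items can be illustrated with a minimal sketch. It assumes each plausible SQL path has already been reduced to the set of schema items it references; the function name `split_schema_needs`, the all-paths agreement rule, and the example column names are hypothetical illustrations, not EviLink's actual procedure.

```python
from collections import Counter
from typing import Iterable


def split_schema_needs(paths: Iterable[set[str]]) -> tuple[set[str], set[str]]:
    """Split schema items into (required, uncertain).

    Assumption for illustration: an item is "required" if every
    hypothesized SQL path uses it, and "uncertain" if only some do.
    """
    path_list = [set(p) for p in paths]
    if not path_list:
        return set(), set()
    # Count in how many hypothesized paths each schema item appears.
    counts = Counter(item for p in path_list for item in p)
    required = {item for item, c in counts.items() if c == len(path_list)}
    uncertain = set(counts) - required
    return required, uncertain


# Three hypothetical SQL paths for one question (made-up schema items).
hypotheses = [
    {"orders.id", "orders.customer_id", "customers.name"},
    {"orders.id", "orders.customer_id", "customers.full_name"},
    {"orders.id", "orders.customer_id", "customers.name", "customers.region"},
]
required, uncertain = split_schema_needs(hypotheses)
```

Under this reading, only the items in `uncertain` would be candidates for further evidence acquisition, while the items in `required` can be kept without additional checks.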
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Schema linking | BIRD (dev) | SRR | 93.68 | 16 |
| Text-to-SQL | Spider2-Snow 80-case | Execution Accuracy | 43.75 | 9 |
| Schema linking | Spider2-Snow | SRR | 90.15 | 8 |
| Table-level schema linking | Spider2-Snow (test) | SRR | 95.75 | 8 |
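
For readers unfamiliar with the SRR metric, the following is a rough sketch of how a strict recall rate is commonly computed. It assumes that an example counts as a hit only when the predicted schema contains every gold schema item; the function name and the toy data are hypothetical, and the exact definition used for the reported numbers may differ.

```python
from typing import Sequence


def strict_recall_rate(predicted: Sequence[set[str]], gold: Sequence[set[str]]) -> float:
    """Percentage of examples whose predicted schema covers all gold items.

    Assumed definition: a partial match counts as a miss.
    """
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    if not gold:
        return 0.0
    # An example is a hit only if its gold set is a subset of the prediction.
    hits = sum(1 for p, g in zip(predicted, gold) if set(g) <= set(p))
    return 100.0 * hits / len(gold)


# Toy data: the first prediction covers its gold set, the second does not.
predicted = [{"t.a", "t.b", "t.c"}, {"t.a"}]
gold = [{"t.a", "t.b"}, {"t.a", "t.b"}]
srr = strict_recall_rate(predicted, gold)
```

Because extra predicted items do not lower this score, strict recall is typically read alongside a relevance or token-cost measure, which matches the trade-off the abstract describes.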