| Dataset Name | SOTA Method | Metric | Score | Trend | Updated |
|---|---|---|---|---|---|
| CoSQL & SParC Aggregate | | Avg Exact Match (EM) | 66.6 | 29 | 1mo ago |
| SParC In-domain | | Exact Match | 71.5 | 29 | 1mo ago |
| CoSQL In-domain | [Long-Horizon] Warm-Start SFT + RL (Outcome + Process) | Exact Match | 65.2 | 29 | 1mo ago |
| SParC (dev) | gpt-oss-20b + Rose-SQL | QM EM | 68.5 | 27 | 28d ago |
| CoSQL (dev) | gpt-oss-20b + Rose-SQL | Query-level EM | 63.98 | 16 | 28d ago |
| SParC Out-of-domain | [Long-Horizon] Warm-Start SFT + RL (Outcome + Process) | Execution Accuracy (EX) | 77.4 | 16 | 1mo ago |
| CoSQL Out-of-domain | [Long-Horizon] Warm-Start SFT + RL (Outcome Only) | Execution Accuracy | 74 | 16 | 1mo ago |
| MMSQL (test) | | TDEX | 67 | 8 | 3mo ago |
| SParC (ADVETA-ADD) | IGSQL | EM | 42.9 | 2 | 3mo ago |
| SParC (ADVETA-RPL) | IGSQL | Exact Match (EM) | 34.2 | 2 | 3mo ago |
| SParC original (dev) | IGSQL | Exact Match (EM) | 50.7 | 2 | 3mo ago |
| CoSQL (ADVETA-ADD) | IGSQL | Exact Match (EM) | 32.8 | 2 | 3mo ago |
| CoSQL (ADVETA-RPL) | IGSQL | Exact Match (EM) | 16.4 | 2 | 3mo ago |
| CoSQL original (dev) | IGSQL | Exact Match (EM) | 44.1 | 2 | 3mo ago |