| Dataset Name | SOTA Method | Metric | Value | Trend | Updated |
|---|---|---|---|---|---|
| CFQ MCD3 | Dynamic Least-to-Most Prompting | Accuracy | 95.5 | 33 | 1mo ago |
| CFQ (MCD2) | Dynamic Least-to-Most Prompting | Accuracy | 95.3 | 33 | 1mo ago |
| CFQ (MCD1) | Grounded Graph Decoding | Accuracy | 98.6 | 33 | 1mo ago |
| GeoQuery (i.i.d.) | T5 11B (Prompt Tuning) | Exact Match Accuracy | 93.6 | 32 | 1mo ago |
| GeoQuery compositional | SpanSub | Accuracy | 89.5 | 29 | 1mo ago |
| WikiSQL (test) | TypeSQL+TC | Execution Accuracy | 82.6 | 27 | 1mo ago |
| OVERNIGHT v1.0 (test) | TWO-STAGE | Blocks Domain Score | 65.7 | 26 | 1mo ago |
| GEO | COARSE2FINE | Accuracy | 93.9 | 26 | 1mo ago |
| COGS (generalization) | Lear | Accuracy (Generalization) | 99 | 25 | 1mo ago |
| SMCalFlow | Grammar Prompting (w. oracle grammar) | Program Accuracy | 88.9 | 22 | 1mo ago |
| CFQ MCD avg | Dynamic Least-to-Most Prompting | Exact Match Accuracy | 95 | 22 | 1mo ago |
| SMCalFlow-CS (16-C) | Cover-LS (Oracle) | Accuracy | 73.5 | 20 | 1mo ago |
| ATIS | COARSE2FINE | Accuracy | 95.1 | 19 | 1mo ago |
| Break | CLG | Accuracy | 42.21 | 18 | 1mo ago |
| GeoQuery (Len.) | DPP | Exact Match Accuracy | 74.3 | 17 | 1mo ago |
| mTOP (test) | POS-based code-switching | Average Score | 91.8 | 17 | 1mo ago |
| WIKITABLEQUESTIONS (test) | TaBERT-Large | Execution Accuracy (Best) | 52.3 | 17 | 1mo ago |
| COGS (test) | SpanSub+L2S2 | Exact Match Accuracy | 92.3 | 16 | 1mo ago |
| SCAN Around Right | SpanSub+L2S2 | Exact-match Accuracy | 100 | 16 | 1mo ago |
| WIKITABLEQUESTIONS (dev) | TaBERT-Large | Execution Accuracy (Best) | 53 | 16 | 1mo ago |
| CFQ MCD3 (test) | LIRind+RIR | Accuracy | 77.9 | 15 | 1mo ago |
| CFQ MCD2 (test) | LIRind+RIR | Accuracy | 85.3 | 15 | 1mo ago |
| CFQ MCD1 (test) | RIR | Accuracy | 88.7 | 15 | 1mo ago |
| NL2Bash | CLG | Accuracy | 45.19 | 14 | 1mo ago |
| Geoquery (test) | DCS+L | Accuracy | 91.1 | 14 | 1mo ago |