| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| Table Question Answering | WikiTQ (test) | Accuracy | 84.3 | 130 |
| Table Question Answering | WikiTQ | Accuracy | 91.84 | 118 |
| Table Question Answering | WikiTQ | F1 Score | 79.8 | 50 |
| Out-of-Distribution Agent Reasoning | WikiTQ (OOD) | OOD WikiTQ Score | 74.68 | 30 |
| Table Question Answering | WikiTQ | Accuracy | 77.2 | 29 |
| Table Question Answering | WikiTQ (dev) | Denotation Acc | 58.3 | 18 |
| NL-to-SQL | WikiTQ | Execution Accuracy | 59.7 | 12 |
| Language-to-Code Generation | WikiTQ official (test) | Execution Accuracy | 74.6 | 12 |
| Table Reasoning | WikiTQ | Exact Match (EM) | 79.4 | 11 |
| Language-to-Code Generation | WikiTQ official (dev) | Execution Accuracy | 70.9 | 11 |
| Table Question Answering | WikiTQ (challenge-set) | F1 Score | 74.41 | 10 |
| Table Reasoning | WikiTQ (T) | Accuracy | 65.56 | 10 |
| Table Reasoning | WikiTQ (D) | Accuracy | 0.651 | 10 |
| Table Question Answering | WikiTQ Large (>4000 tokens) | Accuracy | 57 | 8 |
| NL-to-Scala | WikiTQ | Execution Accuracy | 47.6 | 6 |
| Table Question Answering | WikiTQ | BLEU | 63.19 | 5 |