| Dataset Name | SOTA Method | Metric | Score | Trend | Updated |
|---|---|---|---|---|---|
| DART (test) | C-P (large) | BLEU | 52 | 42 | 1mo ago |
| E2E (test) | Transformer | BLEU | 68.23 | 39 | 1mo ago |
| WebNLG (test) | KGPT-Seq | BLEU | 64.11 | 39 | 1mo ago |
| E2E | Prefix | ROUGE-L | 0.717 | 36 | 1mo ago |
| MLB (test) | Templ | RG Precision | 99.9 | 22 | 1mo ago |
| RotoWire (test) | Templ | Factual Support Score | 7.57 | 19 | 1mo ago |
| ToTTo | Re-Table-7B-rerank | BLEU | 52.28 | 18 | 1mo ago |
| WikiBio (test) | | BLEU | 45.14 | 17 | 1mo ago |
| DART | | BLEU | 51.8 | 16 | 1mo ago |
| ToTTo full (test) | T5-3B | BLEU | 50.8 | 12 | 1mo ago |
| WebNLG en | PEGASUS-2B (calibrated) | ROUGE-2 | 55.52 | 12 | 1mo ago |
| ROTOWIRE English (test) | | RG Score | 61 | 12 | 1mo ago |
| ROTOWIRE (dev) | TEMPL | RG Score | 0.5429 | 12 | 1mo ago |
| DART | GENICL | ROUGE-L | 56.4 | 9 | 1mo ago |
| Cleaned E2E (test) | CONTROL PREFIXES (A2) | BLEU | 44.15 | 9 | 1mo ago |
| SPNLG (test) | Table2seq-beam | BLEU | 40.61 | 9 | 1mo ago |
| WebNLG Unseen v1 | | Fluency Score | 5.63 | 9 | 1mo ago |
| WebNLG Seen v1 | E2E GRU | BLEU | 57.2 | 9 | 1mo ago |
| WebNLG v1 (All) | Transformer | BLEU | 51.68 | 9 | 1mo ago |
| ZebraLogic | | Schema Validity | 100 | 8 | 4d ago |
| DART Open | | Schema Validity | 100 | 8 | 4d ago |
| WebNLG Standard | | Schema Validity | 100 | 8 | 4d ago |
| E2E | SMOOTHIE-GLOBAL | ROUGE-2 | 37.6 | 8 | 1mo ago |
| CommonGen | Se² | ROUGE-L | 34.6 | 8 | 1mo ago |
| German ROTOWIRE (DE-RW) (test) | Templ | RG Score | 54.4 | 8 | 1mo ago |