| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Knowledge-grounded Dialog Generation | Wizard of Wikipedia (WoW) | Win Rate 97.7 | 20 |
| Dialogue Generation | Wizard of Wikipedia (WoW) (dev) | F1 Score 16.4 | 19 |
| Dialogue Generation | Wizard of Wikipedia (WoW) seen (test) | BLEU-1 27.29 | 13 |
| Knowledge-grounded Dialogue Generation | Wizard of Wikipedia unseen (test) | BLEU-1 27.68 | 11 |
| Knowledge-grounded Dialogue Generation | WoW (Wizard of Wikipedia) unseen (test) | ROUGE-1 20.7 | 10 |
| Knowledge-Grounded Dialogue Generation | Wizard of Wikipedia (WoW) Seen (test) | ROUGE-1 21.7 | 10 |
| Knowledge-grounded dialog | Wizard-of-Wikipedia (WoW) (test) | BLEU 100 | 9 |
| Dialog | WoW (Wizard of Wikipedia) (test) | F1 Score 11.38 | 8 |
| Open-domain dialogue | Wizard-of-Wikipedia KILT (test) | F1 Score 15.78 | 8 |
| Fine-grained passage-level retrieval | WoW (Wizard of Wikipedia) | Entity in Context 63.43 | 7 |
| Dialogue Evaluation | Wizard of Wikipedia (WW) | Perplexity 12.4 | 4 |
| Dialogue Generation | Wizard of Wikipedia (WoW) (test seen) | Relevance Win Rate 41.34 | 1 |