Wizard of Wikipedia

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Knowledge-grounded Dialog Generation | Wizard of Wikipedia (WoW) | Win Rate 97.7 | 20 |
| Dialogue Generation | Wizard of Wikipedia (WoW) (dev) | F1 Score 16.4 | 19 |
| Dialogue Generation | Wizard of Wikipedia (WoW) seen (test) | BLEU-1 27.29 | 13 |
| Knowledge-grounded Dialogue Generation | Wizard of Wikipedia unseen (test) | BLEU-1 27.68 | 11 |
| Knowledge-grounded Dialogue Generation | WoW (Wizard of Wikipedia) unseen (test) | ROUGE-1 20.7 | 10 |
| Knowledge-Grounded Dialogue Generation | Wizard of Wikipedia (WoW) Seen (test) | ROUGE-1 21.7 | 10 |
| Knowledge-grounded dialog | Wizard-of-Wikipedia (WoW) (test) | BLEU 100 | 9 |
| Dialog | WoW (Wizard of Wikipedia) (test) | F1 Score 11.38 | 8 |
| Open-domain dialogue | Wizard-of-Wikipedia KILT (test) | F1 Score 15.78 | 8 |
| Fine-grained passage-level retrieval | WoW (Wizard of Wikipedia) | Entity in Context 63.43 | 7 |
| Dialogue Evaluation | Wizard of Wikipedia (WW) | Perplexity 12.4 | 4 |
| Dialogue Generation | Wizard of Wikipedia (WoW) (test seen) | Relevance Win Rate 41.34 | 1 |