Dialogue Generation on Wizard of Wikipedia (WoW) (test seen)
[Chart: Relevance Win Rate by date. CKL: 41.34 (May 29, 2023)]
Updated 4d ago
Evaluation Results
Method: CKL (Comparison Model=DIALKI)
Date: 2023.05

Criterion         Win Rate   Loss Rate   Tie Rate   Kappa
Relevance         41.34      25.25       33.41      0.3
Coherence         42.66      24.26       33.08      0.33
Informativeness   43.23      24.26       32.51      0.31
Overall           36.3       22.28       41.42      0.39
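The win, loss, and tie rates reported for CKL against DIALKI should add up to 100 for each criterion, which gives a quick consistency check on the figures. The sketch below is only an illustration of that arithmetic: the `summarize` helper and the "margin" value (win rate minus loss rate) are hypothetical names introduced here, not metrics defined by the leaderboard.

```python
# Scores reported for CKL (comparison model: DIALKI), in percent.
# Tuple order: (win rate, loss rate, tie rate).
rates = {
    "Relevance": (41.34, 25.25, 33.41),
    "Coherence": (42.66, 24.26, 33.08),
    "Informativeness": (43.23, 24.26, 32.51),
    "Overall": (36.30, 22.28, 41.42),
}


def summarize(rates):
    """Return {criterion: (total, margin)}.

    total  = win + loss + tie (expected to be 100)
    margin = win - loss (illustrative only, not a leaderboard metric)
    """
    summary = {}
    for criterion, (win, loss, tie) in rates.items():
        total = round(win + loss + tie, 2)
        margin = round(win - loss, 2)
        summary[criterion] = (total, margin)
    return summary


summary = summarize(rates)
for criterion, (total, margin) in summary.items():
    print(f"{criterion}: total={total}, win-loss margin={margin}")
```

Rounding to two decimals avoids spurious floating-point noise, since the source values are given to at most two decimal places.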