Dialogue Response Generation on 100 randomly sampled conversational pairs (test)
[Chart: top Appropriateness result is SaBART, 66.1 (Jun 28, 2023). Metrics shown: Appropriateness, Informativeness, Fleiss' Kappa.]
Evaluation Results
| Method | Setting | Date | Appropriateness | Informativeness | Fleiss' Kappa |
|---|---|---|---|---|---|
| SaBART | Comparison Partner=Con... | 2023.06 | 66.1 | 70.2 | - |
| SaBART | Comparison Partner=SaB... | 2023.06 | 62.3 | 60 | - |
| SaBART | Comparison Partner=SaB... | 2023.06 | 61.9 | 65 | - |
| SaBART - w/o dy-agg | Comparison Partner=SaBART | 2023.06 | 38.1 | 35 | - |
| SaBART - w/o st-agg | Comparison Partner=SaBART | 2023.06 | 37.7 | 40 | - |
| ConceptFlow | Comparison Partner=SaBART | 2023.06 | 33.9 | 29.8 | - |