Dialogue Response Generation on P4G (test)
[Chart: Accuracy over time. Data point: Mixed-Initiative Dialogue Prompting, 89 (May 6, 2023). Selectable metrics: Accuracy, Coherence, Consistency, Engagingness, Distinct-3, Distinct-4, QuantiDCE, Win Rate v. FT, Win Rate v. GT, Win Rate v. Prompt.]
Updated 1mo ago
Evaluation Results
| Method | Settings | Date | Accuracy | Coherence | Consistency | Engagingness | Distinct-3 | Distinct-4 | QuantiDCE | Win Rate v. FT | Win Rate v. GT | Win Rate v. Prompt |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Mixed-Initiative Dialogue Prompting | Strategy=Prompt, Backb... | 2023.05 | 89 | 3.83 | 3.71 | 3.69 | 89 | 88 | 3.24 | 59 | 55 | - |
| RAP | Strategy=Fine-tuning (FT) | 2023.05 | 88 | 3.66 | 3.69 | 3.62 | 87 | 88 | 3.16 | - | 48 | 41 |
| Ground Truth | Strategy=GT | 2023.05 | 83 | 3.58 | 3.56 | 3.52 | 88 | 88 | 3.09 | 56 | - | 45 |