Dialogue Response Generation on ESC (test)
[Chart: Accuracy over time. Plotted point: Mixed-Initiative Dialogue Prompting, 88 (May 6, 2023). Selectable metrics: Accuracy, Coherence, Consistency, Engagingness, Distinct-3, Distinct-4, QuantiDCE, Win Rate v. FT, Win Rate v. GT, Win Rate v. Prompt.]
Updated 1mo ago
Evaluation Results
| Method | Details | Date | Accuracy | Coherence | Consistency | Engagingness | Distinct-3 | Distinct-4 | QuantiDCE | Win Rate v. FT | Win Rate v. GT | Win Rate v. Prompt |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Mixed-Initiative Dialogue Prompting | Strategy=Prompt, Backb... | 2023.05 | 88 | 3.72 | 3.8 | 3.81 | 90 | 91 | 3.19 | 0.52 | 64 | - |
| Ground Truth | Strategy=GT | 2023.05 | 85 | 3.57 | 3.6 | 3.61 | 90 | 90 | 3.03 | 0.56 | - | 36 |
| Oracle-BlenderBot | Strategy=Fine-tuning (FT) | 2023.05 | 81 | 3.57 | 3.63 | 3.55 | 89 | 87 | 3.25 | - | 44 | 48 |