
PersonaChat

Benchmarks

Task Name | Dataset Name | Metric | SOTA Result | Trend
Dialogue Generation | PersonaChat (test) | Persona Consistency | 2.31 | 27
Turn-level correlation with human Overall Quality ratings | PersonaChat turn-level | Spearman Correlation | 0.4814 | 20
Personalized Dialogue Generation | PersonaChat (Human Evaluation) | Fluency | 3.58 | 16
Persona Adherence Alignment | PersonaChat 1.0 (test) | Similarity | 100 | 11
Persona Simulation Naturalness Evaluation | PersonaChat (test) | CS (Coherence Score) | 0.718 | 11
Authorship Verification | PersonaChat original (test) | F1 Score | 67.1 | 11
Persona Simulation | PersonaChat | Adherence Score | 100 | 11
Dialogue Policy Evaluation | PersonaChat (test) | USR RET | 97.7 | 10
Social Inclusion | PersonaChat | Diversity | 41.52 | 9
Dialogue Generation | PersonaChat | BLEU-1 | 19.05 | 8
Persona-based Dialogue Generation | PersonaChat (full) | Perplexity | 7.8 | 6
Dialogue Coherence | PersonaChat | QuantiDCE | 3.03 | 5
Machine Unlearning | PersonaChat (Forget Set) | PDLP | 100 | 4
Machine Unlearning | PersonaChat (test) | PDLP | 80 | 4
Dialogue Evaluation | PersonaChat | USR RET | 97.7 | 4
Response Selection | PersonaChat (test) | R@1 (R20 Context) | 86.9 | 3
Dialogue Generation | PersonaChat | Metric | - | 0