ConvAI2

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Response Selection | ConvAI2 (dev) | R@1/20: 90.3 | 25 |
| Dialogue Generation | CONVAI2 | BLEU: 3.68 | 24 |
| Response Selection | ConvAI2 (test) | R@20: 87.9 | 16 |
| Persona-based Dialogue | ConvAI2 (test) | Hits@1: 90.68 | 10 |
| Open domain dialogue | ConvAI2 | RSR: 38.6 | 9 |
| Dialogue Retrieval | ConvAI2 | R@1: 90.7 | 9 |
| Personalized Dialogue Generation | ConvAI2 (Human Evaluation) | Readability: 80 | 8 |
| Open-domain Conversation | NO-ConvAI2 NLEBench (test) | BLEU: 4.28 | 7 |
| Personalized Dialogue Generation | ConvAI2 | BLEU-1: 11.85 | 7 |
| Red Teaming | ConvAI2 (filtered hard positive) | RSR: 2,120 | 7 |
| Open-domain dialogue red teaming | ConvAI2 filtered (test) | RSR: 16.9 | 7 |
| Persona-based Dialogue Generation | ConvAI2 | Coherence: 2.27 | 5 |
| Dialogue Evaluation | ConvAI2 (C2) | Perplexity: 10.2 | 4 |
| Dialogue Generation | ConvAI2 (val) | F1 Score: 21.7 | 4 |
| Attribute-Controlled Dialogue Generation | ConvAI2-CG (test) | Persona Consistency: 2.17 | 3 |
| Red Teaming | ConvAI2 (test) | P Score: 186 | 3 |
| Dialogue Response Generation | ConvAI2 (val) | F1 Score: 20.72 | 3 |
| Dialogue Generation | ConvAI2 (test) | Persona Consistency: 1.89 | 2 |