Our new X account is live! Follow @wizwand_team for updates
SOTA Dialogue Generation benchmarks and papers with code | Wizwand
Dialogue Generation
Benchmarks
| Dataset Name | SOTA Method | Metric | Value | Results | Last Updated |
|---|---|---|---|---|---|
| PersonaChat (test) | LMEDR | Persona Consistency | 2.31 | 27 | 4d ago |
| DailyDialog | BART joint† (D) | Distinct-1 | 9.12 | 26 | 4d ago |
| CONVAI2 | SMoA(r=32,n=2) | BLEU | 3.68 | 24 | 2d ago |
| Douban (test) | Ours | BLEU-1 | 0.1398 | 20 | 4d ago |
| Wizard of Wikipedia (WoW) (dev) | KID | F1 Score | 16.4 | 19 | 4d ago |
| Proposed Multi-scenario Dataset 1.0 (test) | SGM | Acc T | 86.37 | 18 | 4d ago |
| Anthropic-HH (test) | Cal-DPO | Average Preference Score | 69.07 | 16 | 4d ago |
| DailyDialog Multi-reference | DialoGPS | BLEU-1 | 38.46 | 16 | 4d ago |
| TG-ReDial | TREA | BLEU-2 | 5 | 16 | 4d ago |
| 4 dialogue datasets Aggregate (test val) | OPT | Dialogue Avg F1 | 12.9 | 15 | 4d ago |
| CausalDialogue (test) | Human Written Responses | PPL | 1.2 | 13 | 4d ago |
| CMU-DoG (test) | CKL | BLEU-1 | 17.74 | 13 | 4d ago |
| Wizard of Wikipedia (WoW) seen (test) | CKL | BLEU-1 | 27.29 | 13 | 4d ago |
| Reddit Conversation Corpus (test) | DialoGPT | PPL | 36.03 | 13 | 4d ago |
| PERSONA-CHAT Original (dev) | LMEDR | Hits@1 | 89.5 | 13 | 4d ago |
| Syn. Persona | LongGuide | ROUGE-L | 22.98 | 12 | 3d ago |
| PERSONA-CHAT Revised (dev) | LMEDR | Hits@1 | 85 | 11 | 4d ago |
| E2E | BOMF | BLEU | 64.81 | 10 | 4d ago |
| Commonsense Dialogue Dataset (test) | SaBART | Dist-1 | 0.0616 | 10 | 4d ago |
| Commonsense Dialogue (CD) | Teacher | BLEU-1 | 11.67 | 9 | 4d ago |
| Human-in-the-loop Interactive Evaluation Customer-Agent Dialogs | GER | Win Rate (vs GPT-4) | 41 | 8 | 3d ago |
| full-hh-rlhf (test) | ReMax+XRLHF | Win Rate (Beaver-7b-v3.0-reward) | 79.3 | 8 | 3d ago |
| LCCC (test) | Temperature (DDS) | Distinct-1 | 0.4408 | 8 | 4d ago |
| LQA (test) | Top-p (DDS) | BLEU-1 | 0.0927 | 8 | 4d ago |
| PersonaChat | DialoGPS | BLEU-1 | 19.05 | 8 | 4d ago |
Showing 25 of 59 rows