MultiWOZ

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Dialogue State Tracking | MultiWOZ 2.1 (test) | Joint Goal Accuracy 60.76 | 105 |
| Dialog State Tracking | MultiWOZ 2.1 (test) | Joint Goal Accuracy 79.2 | 88 |
| Dialogue State Tracking | MultiWOZ 2.2 (test) | Joint Goal Accuracy 66.25 | 80 |
| End-to-end task-oriented dialogue | MultiWOZ (test) | Task Success Rate 94 | 68 |
| End-to-End Task-Oriented Dialogue | MultiWOZ 2.1 (test) | BLEU Score 22.19 | 57 |
| Dialog State Tracking | MultiWOZ 2.0 (test) | Joint Goal Accuracy 55.03 | 47 |
| Dialogue State Tracking | MultiWOZ 2.4 (test) | Joint Goal Acc 78.2 | 45 |
| Task-Oriented Dialogue | MultiWOZ 2.0 (test) | Inform Rate 99.1 | 37 |
| Dialogue State Tracking | MultiWOZ 2.1 | Joint Goal Accuracy 60.61 | 36 |
| Dialogue State Tracking | MultiWOZ 2.0 (test) | Joint Goal Accuracy 57.23 | 29 |
| Response Generation | MultiWOZ (test) | BLEU Score 35.1 | 27 |
| Task-Oriented Dialogue | MultiWOZ 2.2 (test) | Inform Rate 96.48 | 23 |
| End-to-end Task-oriented Dialogue | MultiWOZ 2.0 (test) | Inform Accuracy 97.5 | 22 |
| End-to-end Dialogue Modelling | MultiWOZ 2.0 (test) | Inform Rate 95.4 | 22 |
| Knowledge-grounded Dialog Generation | MultiWOZ | Win Rate 98.7 | 20 |
| Spoken Dialogue State Tracking | MultiWOZ (test) | Joint Goal Acc 32.4 | 17 |
| Untargeted Dialogue State Extraction | MultiWOZ | State Precision 42 | 15 |
| Task-Oriented Dialogue | MultiWOZ 2.4 (test) | JGA 43.8 | 15 |
| User simulator goal alignment | MultiWOZ Challenge | Profile Success Rate (Prof.) 80.1 | 14 |
| Multi-domain Dialog | MultiWOZ | Success Rate 85.3 | 13 |
| Task-oriented dialogue | MultiWOZ 2.0 | Inform Rate 91.8 | 13 |
| Dialogue State Tracking | MultiWOZ zero-shot 2.1 | Attraction Accuracy 38.86 | 11 |
| End-to-end dialog modeling | MultiWOZ 2.0 | Inform Score 94.3 | 11 |
| Task-Focused Dialogue | Multiwoz | TSE Score 0.7699 | 11 |
| Task-Oriented Dialogue | MultiWOZ 2.1 (test) | Inform Rate 99.62 | 11 |
Showing 25 of 71 rows