
SGD

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Dialogue State Tracking | SGD | JGA (Overall) 55.01 | 24 |
| Dialog State Tracking | SGD 15 tasks CL | Avg JGA 76.3 | 23 |
| Natural Language Generation | SGD (test) | BLEU 28.6 | 18 |
| User Satisfaction Estimation | SGD | Accuracy 64.8 | 14 |
| Task-oriented Dialogue | FewShotSGD unseen schemata (test) | BLEU 28.76 | 13 |
| Task-oriented Dialogue | FewShotSGD seen schemata (test) | BLEU 29.28 | 13 |
| Dialogue State Tracking | SGD (test) | JGA 86.5 | 11 |
| Dialogue State Tracking | SGD (train) | JGA 55.99 | 9 |
| Dialogue State Tracking | SGD Messaging | JGA 67.79 | 9 |
| Dialog Structure Induction | SGD (test) | Purity 46.8 | 9 |
| User Satisfaction Estimation | SGD 5% training size (test) | Precision 75.3 | 8 |
| Dialogue State Tracking | SGD Media | JGA 76.2 | 7 |
| Task-Oriented Dialogue | SGD 1.0 (test) | Inform Rate 81.29 | 6 |
| Structure Induction | SGD Real (test) | AMI 0.559 | 6 |
| Hidden Representation Learning | SGD Real (test) | Class-Balanced Acc (Full) 66.3 | 6 |
| Dialogue State Tracking | SGD-X v1-v5 variants (test) | Joint Goal Acc (Original) 86.4 | 6 |
| Dialogue State Tracking | SGD Music | JGA 35.46 | 5 |
| Dialogue State Tracking | SGD Flights | Joint Goal Accuracy (JGA) 30.57 | 5 |
| Goal completion | SGD (test) | Inform Rate 50.4 | 5 |
| Dialogue State Tracking | SGD to MultiWoz (test) | Average JGA 51.2 | 5 |
| Dialogue State Tracking | SGD All Domains (test) | Joint GA 32.1 | 4 |
| Dialogue State Tracking | SGD Unseen Domains (test) | Joint GA 24.4 | 4 |
| Natural Language Generation | SGD (Overall) | Naturalness 2.46 | 4 |
| Natural Language Generation | SGD Seen domains | Naturalness 2.48 | 4 |
| Natural Language Generation | SGD Unseen domains | Naturalness 2.46 | 4 |
Showing 25 of 37 rows