
DSTC

Benchmarks

| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| Response Selection | DSTC7 Track 1 (test) | Recall@1 (Top 100) | 91.1 | 27 |
| Dialogue Evaluation | DSTC9 Interactive Dialogue Evaluation Track (test) | Human Rating | 5 | 12 |
| Response Generation | DSTC7 Shared Task (test) | NIST-4 | 2.669 | 8 |
| Dialog | DSTC2 (test) | Average Error Rate | 48.9 | 7 |
| Theme Detection | DSTC Travel domain 12 (test) | Semantic Relevance (SR) | 89.7 | 6 |
| Conversation-level Topic Discovery and Labeling | DSTC-12 Travel domain (blind test) | Accuracy | 0.68 | 6 |
| Dialogue State Tracking | DSTC2 | Joint GA | 85 | 5 |
| Dialog act prediction | DSTC 4 (test) | Accuracy | 66.2 | 4 |
| Multi-intent Natural Language Understanding | DSTC4 | Slot F1 | 61.1 | 3 |
| Task-oriented dialogue | DSTC9 shared task Human Evaluation (test) | Avg Success Rate | 74.8 | 3 |