DSTC

Benchmarks

Task Name	Dataset Name	SOTA Result
Response Selection	DSTC7 Track 1 (test)	Recall@1 (Top 100): 91.1	27
Dialogue Evaluation	DSTC9 Interactive Dialogue Evaluation Track (test)	Human Rating: 5	12
Response Generation	DSTC7 Shared Task (test)	NIST-4: 2.669	8
Dialog	DSTC2 (test)	Average Error Rate: 48.9	7
Theme Detection	DSTC Travel domain 12 (test)	Semantic Relevance (SR): 89.7	6
Conversation-level Topic Discovery and Labeling	DSTC-12 Travel domain (blind test)	Accuracy: 0.68	6
Dialogue response evaluation	DSTC 12 (test)	Empathy: 0.17	5
Dialogue State Tracking	DSTC2	Joint GA: 85	5
Dialog act prediction	DSTC 4 (test)	Accuracy: 66.2	4
Multi-intent Natural Language Understanding	DSTC4	Slot F1: 61.1	3
Task-oriented dialogue	DSTC9 shared task Human Evaluation (test)	Avg Success Rate: 74.8	3
