Improved Deep Learning Baselines for Ubuntu Corpus Dialogs

About

This paper presents the results of our experiments on next utterance ranking on the Ubuntu Dialog Corpus -- the largest publicly available multi-turn dialog corpus. First, we use an in-house implementation of previously reported models to perform an independent evaluation on the same data. Second, we evaluate the performance of various LSTMs, Bi-LSTMs and CNNs on the dataset. Third, we create an ensemble by averaging the predictions of multiple models. The ensemble further improves performance and achieves a state-of-the-art result for next utterance ranking on this dataset. Finally, we discuss our future plans for this corpus.
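The ensembling step mentioned above, averaging the predictions of several models, can be illustrated with a minimal sketch. It assumes each model outputs one score per candidate response for a given context. The function names and the example scores are hypothetical and are not taken from the paper; the sketch uses a plain unweighted mean and does not claim to match the authors' implementation.

```python
from typing import List, Sequence


def average_predictions(model_scores: Sequence[Sequence[float]]) -> List[float]:
    """Return the unweighted mean score per candidate across all models.

    model_scores[m][c] is the score model m assigns to candidate c.
    """
    if not model_scores:
        raise ValueError("need scores from at least one model")
    n_candidates = len(model_scores[0])
    if any(len(scores) != n_candidates for scores in model_scores):
        raise ValueError("all models must score the same number of candidates")
    n_models = len(model_scores)
    # zip(*model_scores) groups the scores of each candidate across models.
    return [sum(per_candidate) / n_models for per_candidate in zip(*model_scores)]


def rank_candidates(scores: Sequence[float]) -> List[int]:
    """Return candidate indices ordered from highest to lowest score."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


# Hypothetical scores from three models for three candidate responses.
lstm_scores = [0.8, 0.1, 0.3]
bilstm_scores = [0.4, 0.2, 0.7]
cnn_scores = [0.6, 0.3, 0.2]

ensemble_scores = average_predictions([lstm_scores, bilstm_scores, cnn_scores])
ensemble_ranking = rank_candidates(ensemble_scores)
```

In this toy example the Bi-LSTM alone would rank candidate 2 first, while the averaged scores put candidate 0 on top, which shows how averaging can override a single model's mistake.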

Rudolf Kadlec, Martin Schmid, Jan Kleindienst • 2015

Related benchmarks

Task	Dataset	Result
Multi-turn Response Selection	Ubuntu Dialogue Corpus V1 (test)	R10@1 63.8	102
Response Selection	Douban Conversation Corpus (test)	MAP 0.485	94
Response Selection	E-commerce (test)	Recall@1 (R10) 0.365	81
Multi-turn Response Selection	E-commerce Dialogue Corpus (test)	R@1 (Top 10 Set) 36.5	70
Multi-turn Response Selection	Douban Conversation Corpus	MAP 0.485	67
Multi-turn Response Selection	Ubuntu Corpus	Recall@1 (R10) 63.8	65
Response Selection	Ubuntu (test)	Recall@1 (Top 10) 0.638	58
Response Ranking	Ubuntu Dialog Corpus v1 (test)	Recall@1 (1/2) 91.5	16
Multi-turn Response Selection	E-commerce	R@1 36.5	14
Answer Ranking	Ubuntu v2 (test)	Recall@1 (1/2 Pool) 86.9	11
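Most rows above report Recall@1 over a pool of 10 or 2 candidates. As a rough sketch of how such a metric is commonly computed, the code below counts an example as a hit when the ground-truth response ranks within the top k of its candidate pool. The function name, the convention that the ground truth sits at index 0, the optimistic handling of ties, and the example scores are all assumptions made for illustration, not details confirmed by the paper or the benchmark listings.

```python
from typing import Sequence


def recall_at_k(score_lists: Sequence[Sequence[float]], k: int, true_index: int = 0) -> float:
    """Fraction of examples whose ground-truth candidate ranks in the top k.

    score_lists[e][c] is the model score for candidate c of example e.
    Assumption: the ground-truth response is at position true_index.
    Ties are resolved in favour of the ground truth (optimistic ranking).
    """
    if not score_lists:
        raise ValueError("need at least one example")
    hits = 0
    for scores in score_lists:
        true_score = scores[true_index]
        # Count the distractors that score strictly higher than the ground truth.
        better = sum(
            1 for j, s in enumerate(scores) if j != true_index and s > true_score
        )
        if better < k:
            hits += 1
    return hits / len(score_lists)


# Hypothetical scores: three examples, three candidates each, ground truth first.
examples = [
    [0.9, 0.1, 0.3],  # ground truth ranked 1st
    [0.2, 0.8, 0.5],  # ground truth ranked 3rd
    [0.6, 0.7, 0.1],  # ground truth ranked 2nd
]

r_at_1 = recall_at_k(examples, k=1)
r_at_2 = recall_at_k(examples, k=2)
```

With a pool of 10 candidates per example this corresponds to the "R10@1" style of result, and with a pool of 2 to the "1/2" variants, under the assumptions stated above.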
