Improved Deep Learning Baselines for Ubuntu Corpus Dialogs
About
This paper presents results of our experiments on next utterance ranking on the Ubuntu Dialog Corpus, the largest publicly available multi-turn dialog corpus. First, we use an in-house implementation of previously reported models to perform an independent evaluation on the same data. Second, we evaluate the performance of various LSTMs, Bi-LSTMs and CNNs on the dataset. Third, we create an ensemble by averaging the predictions of multiple models. The ensemble further improves performance and achieves a state-of-the-art result for next utterance ranking on this dataset. Finally, we discuss our future plans for this corpus.
Rudolf Kadlec, Martin Schmid, Jan Kleindienst • 2015
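The ensembling step mentioned in the abstract (averaging the predictions of multiple models) and the Recall@k metric used in the benchmarks can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `ensemble_average` and `recall_at_k`, the array layout, the convention that the true response sits at index 0, and the toy scores are all assumptions made for illustration.

```python
import numpy as np


def ensemble_average(model_scores):
    """Average candidate scores across models.

    model_scores: array-like of shape (n_models, n_examples, n_candidates).
    Returns an array of shape (n_examples, n_candidates).
    """
    return np.mean(np.asarray(model_scores, dtype=float), axis=0)


def recall_at_k(scores, true_index=0, k=1):
    """Fraction of examples whose true response is ranked in the top k.

    scores: array-like of shape (n_examples, n_candidates).
    true_index: column holding the true response (assumed layout).
    Ties are resolved in favor of the true response.
    """
    scores = np.asarray(scores, dtype=float)
    true_scores = scores[:, true_index][:, None]
    # Rank = number of candidates scoring strictly higher than the true one.
    ranks = (scores > true_scores).sum(axis=1)
    return float(np.mean(ranks < k))


# Toy scores from two hypothetical models: 2 examples, 3 candidates each.
model_a = [[0.6, 0.7, 0.1], [0.9, 0.2, 0.3]]
model_b = [[0.8, 0.3, 0.2], [0.4, 0.5, 0.1]]

ensemble = ensemble_average([model_a, model_b])

print("model A Recall@1:", recall_at_k(model_a))
print("model B Recall@1:", recall_at_k(model_b))
print("ensemble Recall@1:", recall_at_k(ensemble))
```

In this toy data each model ranks the true response first on only one of the two examples, while the averaged scores rank it first on both. The example is constructed to show why averaging can help and says nothing about the size of the gain reported in the paper.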
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Multi-turn Response Selection | Ubuntu Dialogue Corpus V1 (test) | R10@1 | 63.8 | 102 |
| Response Selection | Douban Conversation Corpus (test) | MAP | 0.485 | 94 |
| Response Selection | E-commerce (test) | Recall@1 (R10) | 0.365 | 81 |
| Multi-turn Response Selection | E-commerce Dialogue Corpus (test) | R@1 (Top 10 Set) | 36.5 | 70 |
| Multi-turn Response Selection | Douban Conversation Corpus | MAP | 0.485 | 67 |
| Multi-turn Response Selection | Ubuntu Corpus | Recall@1 (R10) | 63.8 | 65 |
| Response Selection | Ubuntu (test) | Recall@1 (Top 10) | 0.638 | 58 |
| Response Ranking | Ubuntu Dialog Corpus v1 (test) | Recall@1 (1/2) | 91.5 | 16 |
| Multi-turn Response Selection | E-commerce | R@1 | 36.5 | 14 |
| Answer Ranking | Ubuntu v2 (test) | Recall@1 (1/2 Pool) | 86.9 | 11 |