Key-Value Retrieval Networks for Task-Oriented Dialogue
About
Neural task-oriented dialogue systems often struggle to smoothly interface with a knowledge base. In this work, we seek to address this problem by proposing a new neural dialogue agent that is able to effectively sustain grounded, multi-domain discourse through a novel key-value retrieval mechanism. The model is end-to-end differentiable and does not need to explicitly model dialogue state or belief trackers. We also release a new dataset of 3,031 dialogues that are grounded through underlying knowledge bases and span three distinct tasks in the in-car personal assistant space: calendar scheduling, weather information retrieval, and point-of-interest navigation. Our architecture is simultaneously trained on data from all domains and significantly outperforms a competitive rule-based system and other existing neural dialogue architectures on the provided domains according to both automatic and human evaluation metrics.
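The abstract names a key-value retrieval mechanism but does not spell out its form. As a rough illustration only, the sketch below shows one generic way such a lookup can work: a query vector is scored against each key by dot product, and a softmax turns the scores into attention weights over the stored values, which keeps the lookup differentiable. The function names (`softmax`, `kv_retrieve`), the vectors, and the toy knowledge-base entries are all hypothetical and are not taken from the paper or its dataset.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def kv_retrieve(query, keys, values):
    """Score each key against the query by dot product and return
    (value, weight) pairs sorted by descending attention weight."""
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    weights = softmax(scores)
    return sorted(zip(values, weights), key=lambda pair: pair[1], reverse=True)


# Hypothetical toy knowledge base: each key vector stands in for an
# embedded knowledge-base field, each value for the entry stored under it.
keys = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
values = ["5pm", "conference_room", "dentist"]

# Hypothetical query vector, standing in for a decoder state.
query = [0.1, 2.0, 0.1]

ranked = kv_retrieve(query, keys, values)
best_value, best_weight = ranked[0]
```

In a trained model the key and query vectors would be learned rather than hand-written, and the attention weights would typically feed into the response decoder instead of being ranked directly.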
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Task-oriented Dialogue | Stanford Multi-Domain Dialogue (SMD) (test) | BLEU 13.2 | 29 |
| Task-oriented Dialog Generation | In-Car Assistant (test) | BLEU 13.5 | 9 |
| Dialogue Response Generation | In-car dialogue dataset (test) | BLEU 13.2 | 6 |
| Task-oriented Dialogue | In-car personal assistant dataset realtime dialogues | Fluency 3.36 | 4 |
| Task-oriented Dialogue Response Generation | Stanford Multi-turn Multi-domain Task-oriented Dialogue Dataset Navigation (test) | BLEU 8.7 | 4 |
| Task-oriented Dialogue Response Generation | Stanford Multi-turn Multi-domain Task-oriented Dialogue Dataset Weather SMD (test) | BLEU 12.4 | 4 |
| Dialogue Generation | Navigation (test) | Correctness 3.61 | 3 |
| Task-oriented Dialogue | In-car personal assistant dialogue dataset (test) | Correctness 3.7 | 3 |