Learning End-to-End Goal-Oriented Dialog

About

Traditional dialog systems used in goal-oriented applications require a lot of domain-specific handcrafting, which hinders scaling up to new domains. End-to-end dialog systems, in which all components are trained from the dialogs themselves, escape this limitation. But the encouraging success recently obtained in chit-chat dialog may not carry over to goal-oriented settings. This paper proposes a testbed to break down the strengths and shortcomings of end-to-end dialog systems in goal-oriented applications. Set in the context of restaurant reservation, our tasks require manipulating sentences and symbols, so as to properly conduct conversations, issue API calls and use the outputs of such calls. We show that an end-to-end dialog system based on Memory Networks can reach promising, yet imperfect, performance and learn to perform non-trivial operations. We confirm those results by comparing our system to a hand-crafted slot-filling baseline on data from the second Dialog State Tracking Challenge (Henderson et al., 2014a). We show similar result patterns on data extracted from an online concierge service.

Antoine Bordes, Y-Lan Boureau, Jason Weston • 2016

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Dialog | bAbI dialog 1.0 (OOV) | Avg Error Rate | 0.279 | 22 |
| Dialog Generation | DSTC2 (test) | Accuracy (Response) | 33.3 | 10 |
| Dialogue Response Generation | bAbI Dialogue Task 2 | Per-response accuracy | 100 | 9 |
| Dialogue Response Generation | bAbI Dialogue Task 3 | Accuracy (Per-response) | 74.9 | 9 |
| Dialogue Response Generation | bAbI Dialogue Task 4 | Per-response Accuracy | 59.5 | 9 |
| Dialogue Response Generation | bAbI Dialogue Task 4 OOV | Per-response Accuracy | 57.6 | 9 |
| Dialogue Response Generation | bAbI Dialogue Task 2 OOV | Accuracy (Per-response) | 78.9 | 9 |
| Dialogue Response Generation | bAbI Dialogue Task 1 | Per-response Accuracy | 99.9 | 9 |
| Dialogue Response Generation | bAbI Dialogue Task 5 | Per-response Accuracy | 96.1 | 9 |
| Dialogue Response Generation | bAbI Dialogue Task 1 OOV | Per-response Accuracy | 0.723 | 9 |
Showing 10 of 18 rows
