SOLOIST: Building Task Bots at Scale with Transfer Learning and Machine Teaching
About
We present a new method, SOLOIST, that uses transfer learning and machine teaching to build task bots at scale. We parameterize classical modular task-oriented dialog systems using a Transformer-based auto-regressive language model, which subsumes the different dialog modules into a single neural model. We pre-train, on heterogeneous dialog corpora, a task-grounded response generation model that can generate dialog responses grounded in user goals and real-world knowledge for task completion. The pre-trained model can be efficiently adapted to new tasks with a handful of task-specific dialogs via machine teaching, where training samples are generated by human teachers interacting with the system. Experiments show that (i) SOLOIST achieves new state-of-the-art results on well-studied task-oriented dialog benchmarks, including CamRest676 and MultiWOZ; (ii) in few-shot fine-tuning settings, SOLOIST significantly outperforms existing methods; and (iii) the use of machine teaching substantially reduces the labeling cost of fine-tuning. The pre-trained models and code are available at https://aka.ms/soloist.
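To make the idea of folding several dialog modules into one auto-regressive model more concrete, the sketch below flattens a single dialog turn into one text sequence that a language model could be trained on with a next-token objective. It is only an illustration under stated assumptions: the `Turn` class, the helper names, the `=>`, `belief :` and `db :` markers, and the `<eos>` separator are all hypothetical and are not taken from the released SOLOIST code.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Turn:
    """One dialog turn (hypothetical container, not the SOLOIST data format)."""
    history: List[str]                        # utterances, oldest first, user speaks first
    belief_state: Dict[str, Dict[str, str]]   # domain -> slot -> value (the user goal so far)
    db_matches: int                           # number of database entries matching the belief state
    response: str                             # delexicalized system response


def serialize_belief(belief: Dict[str, Dict[str, str]]) -> str:
    """Flatten the belief state into text, sorting keys for a deterministic order."""
    parts = []
    for domain in sorted(belief):
        slots = " ; ".join(
            f"{slot} = {value}" for slot, value in sorted(belief[domain].items())
        )
        parts.append(f"{domain} {slots}")
    return " | ".join(parts)


def serialize_turn(turn: Turn, eos: str = "<eos>") -> str:
    """Concatenate history, belief state, DB result and response into one sequence.

    The marker strings are assumptions chosen for readability only.
    """
    history = " ".join(
        f"{'user' if i % 2 == 0 else 'system'} : {utterance}"
        for i, utterance in enumerate(turn.history)
    )
    belief = serialize_belief(turn.belief_state)
    db = f"{turn.db_matches} matches"
    return f"{history} => belief : {belief} {eos} db : {db} {eos} {turn.response} {eos}"


example = Turn(
    history=["i need a cheap restaurant in the centre"],
    belief_state={"restaurant": {"pricerange": "cheap", "area": "centre"}},
    db_matches=3,
    response="[restaurant_name] is a cheap option in the centre .",
)
training_text = serialize_turn(example)
print(training_text)
```

With a layout like this, state tracking, grounding in database results, and response generation all become continuation of one string, which is what lets a single model stand in for the separate modules. At inference time one would generate up to the first separator, query the database with the predicted belief state, append the result, and then continue generating the response.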
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Intent Classification | Banking77 (test) | Accuracy | 93.8 | 151 |
| Dialog State Tracking | MultiWOZ 2.1 (test) | Joint Goal Accuracy | 56.85 | 88 |
| Dialogue State Tracking | MultiWOZ 2.1 (test) | Joint Goal Accuracy | 53.36 | 85 |
| End-to-end task-oriented dialogue | MultiWOZ (test) | Task Success Rate | 79.3 | 68 |
| Dialog State Tracking | MultiWOZ 2.0 (test) | Joint Goal Accuracy | 53.2 | 47 |
| Task-oriented Dialogue | MultiWOZ 2.0 (test) | Inform Rate | 85.5 | 37 |
| Slot Filling | Restaurant-8K | F1 Score | 98 | 32 |
| Task-oriented Dialogue | MultiWOZ 2.2 (test) | Inform Rate | 82.3 | 23 |
| End-to-end Dialogue Modelling | MultiWOZ 2.0 (test) | Inform Rate | 85.5 | 22 |
| End-to-end task-oriented dialogue | MultiWOZ 2.0 (test) | Inform Accuracy | 85.5 | 22 |