TOD-BERT: Pre-trained Natural Language Understanding for Task-Oriented Dialogue
About
The underlying difference in linguistic patterns between general text and task-oriented dialogue makes existing pre-trained language models less useful in practice. In this work, we unify nine human-human, multi-turn task-oriented dialogue datasets for language modeling. To better model dialogue behavior during pre-training, we incorporate user and system tokens into masked language modeling, and we propose a contrastive objective function that simulates the response selection task. Our pre-trained task-oriented dialogue BERT (TOD-BERT) outperforms strong baselines such as BERT on four downstream task-oriented dialogue applications: intention recognition, dialogue state tracking, dialogue act prediction, and response selection. We also show that TOD-BERT has stronger few-shot ability, which can mitigate the data scarcity problem in task-oriented dialogue.
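The two pre-training ideas in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the token strings `[USR]` and `[SYS]`, the helper names `flatten_dialogue` and `contrastive_loss`, and the use of in-batch negatives with dot-product scores are all assumptions chosen for illustration. The abstract only states that user and system tokens are added and that a contrastive objective simulates response selection.

```python
import math

# Assumed token strings; the abstract only says "user and system tokens".
USR, SYS = "[USR]", "[SYS]"


def flatten_dialogue(turns):
    """Prefix each utterance with its speaker token and join into one sequence.

    `turns` is a list of (speaker, text) pairs, where speaker is "user" or
    "system". The flattened string is what a masked language model would see.
    """
    parts = []
    for speaker, text in turns:
        token = USR if speaker == "user" else SYS
        parts.append(f"{token} {text}")
    return " ".join(parts)


def contrastive_loss(context_vecs, response_vecs):
    """Hypothetical in-batch response selection loss.

    For context i, response i is treated as the positive and every other
    response in the batch as a negative. The loss is the mean cross entropy
    of a softmax over dot-product scores.
    """
    total = 0.0
    for i, context in enumerate(context_vecs):
        scores = [sum(a * b for a, b in zip(context, r)) for r in response_vecs]
        top = max(scores)  # subtract the max for numerical stability
        log_z = top + math.log(sum(math.exp(s - top) for s in scores))
        total += log_z - scores[i]
    return total / len(context_vecs)


if __name__ == "__main__":
    dialogue = [("user", "I need a cheap hotel."), ("system", "Which area?")]
    print(flatten_dialogue(dialogue))
    # Toy 2-d "encodings": each context is aligned with its own response.
    print(contrastive_loss([[5.0, 0.0], [0.0, 5.0]], [[5.0, 0.0], [0.0, 5.0]]))
```

In a real setup the vectors would come from the encoder's representation of the dialogue context and of the candidate response. The loss falls as each context scores its own response higher than the other responses in the batch.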
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Dialogue State Tracking | MultiWOZ 2.1 (test) | Joint Goal Accuracy | 48 | 105 |
| Dialog State Tracking | MultiWOZ 2.1 (test) | Joint Goal Accuracy | 48 | 88 |
| Dialogue Segmentation | DialSeg711 | Pk | 0.11 | 44 |
| Dialogue Segmentation | TIAGE | Pk | 0.365 | 39 |
| Dialogue Topic Segmentation | Doc2Dial | Pk | 27.5 | 34 |
| Dialogue Topic Segmentation | SuperSeg | Pk Score | 20 | 28 |
| Dialogue Topic Segmentation | VHF | Pk Score | 8.5 | 25 |
| Intent Classification | HINT3 10-shot | Accuracy | 66.42 | 23 |
| Intent Classification | MCID 10-shot | Accuracy | 74.66 | 23 |
| Intent Classification | HINT3 5-shot | Accuracy | 56.33 | 23 |