Continual Learning in Task-Oriented Dialogue Systems
About
Continual learning in task-oriented dialogue systems allows new domains and functionalities to be added over time without incurring the high cost of retraining the whole system. In this paper, we propose a continual learning benchmark for task-oriented dialogue systems with 37 domains to be learned continuously in four settings: intent recognition, state tracking, natural language generation, and end-to-end. Moreover, we implement and compare multiple existing continual learning baselines, and we propose a simple yet effective architectural method based on residual adapters. Our experiments demonstrate that the proposed architectural method and a simple replay-based strategy perform comparably well, but both achieve inferior performance to the multi-task learning baseline, in which all the data are shown at once, showing that continual learning in task-oriented dialogue systems is a challenging task. Furthermore, we reveal several trade-offs between different continual learning methods in terms of parameter usage and memory size, which are important in the design of a task-oriented dialogue system. The proposed benchmark is released together with several baselines to promote more research in this direction.
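The replay-based strategy mentioned above can be sketched as an episodic memory that stores a small sample of examples from earlier domains and mixes them into each new domain's training batches. The sketch below is a minimal, self-contained illustration (hypothetical class and function names, plain Python objects standing in for dialogue examples), not the paper's implementation; it uses reservoir sampling to keep the memory bounded.

```python
import random


class ReplayBuffer:
    """Fixed-size episodic memory filled via reservoir sampling."""

    def __init__(self, capacity, seed=0):
        self.capacity = capacity
        self.storage = []   # retained examples from past domains
        self.seen = 0       # total examples observed so far
        self.rng = random.Random(seed)

    def add(self, example):
        self.seen += 1
        if len(self.storage) < self.capacity:
            self.storage.append(example)
        else:
            # Reservoir sampling: every example ever seen ends up
            # stored with equal probability capacity / seen.
            j = self.rng.randrange(self.seen)
            if j < self.capacity:
                self.storage[j] = example

    def sample(self, k):
        k = min(k, len(self.storage))
        return self.rng.sample(self.storage, k)


def continual_training(domains, buffer, batch_size=4):
    """Visit domains sequentially, mixing replayed examples into
    each batch (each batch would drive one optimizer step)."""
    batches = []
    for domain_examples in domains:
        for ex in domain_examples:
            mixed = [ex] + buffer.sample(batch_size - 1)
            batches.append(mixed)
            buffer.add(ex)
    return batches
```

Because the buffer is updated after every example, later domains train against a uniform sample of everything seen before, which is what keeps a replay baseline's memory footprint fixed regardless of how many domains have been learned.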
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Mathematical Reasoning | MATH | Accuracy | 8.04 | 643 |
| Reasoning | BBH | Accuracy | 28.8 | 507 |
| Mathematical Reasoning | SVAMP | Accuracy | 41.6 | 368 |
| Logical Reasoning | LogiQA | Accuracy | 34.41 | 98 |
| Knowledge | MMLU | Accuracy | 42.18 | 71 |
| Dialog State Tracking | SGD 15 tasks CL | Avg JGA | 58.6 | 23 |
| Knowledge | MMB | Accuracy | 35.27 | 21 |
| Text Classification | Restaurant, AI, ACL, AGNews Continual Learning Sequence (test) | Restaurant MF1 | 52.19 | 14 |
| Natural Language Processing | decaNLP Tasks (unseen) | AN' | 30.32 | 14 |
| Question Answering | QA Tasks (unseen) | AN' Score | 36.84 | 14 |