Out-of-Scope Intent Detection with Self-Supervision and Discriminative Training
About
Out-of-scope intent detection is of practical importance in task-oriented dialogue systems. Since the distribution of outlier utterances is arbitrary and unknown in the training stage, existing methods commonly rely on strong assumptions about the data distribution, such as a mixture of Gaussians, to make inferences, resulting in either complex multi-step training procedures or hand-crafted rules such as confidence threshold selection for outlier detection. In this paper, we propose a simple yet effective method to train an out-of-scope intent classifier in a fully end-to-end manner by simulating the test scenario in training, which requires no assumption on the data distribution and no additional post-processing or threshold setting. Specifically, we construct a set of pseudo outliers in the training stage by generating synthetic outliers from inlier features via self-supervision and by sampling out-of-scope sentences from easily available open-domain datasets. The pseudo outliers are used to train a discriminative classifier that can be directly applied to, and generalizes well on, the test task. We evaluate our method extensively on four benchmark dialogue datasets and observe significant improvements over state-of-the-art approaches. Our code has been released at https://github.com/liam0949/DCLOOS.
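The pseudo-outlier construction described in the abstract can be illustrated with a minimal NumPy sketch. It is not the released implementation. It assumes that synthetic outliers are formed as convex combinations of inlier features drawn from two different classes, and that all pseudo outliers share a single extra class index K, so that a (K+1)-way classifier can be trained on the result. The abstract states neither detail explicitly, and the function names `synth_outliers` and `build_training_set` are hypothetical.

```python
import numpy as np


def synth_outliers(feats, labels, n_out, rng):
    """Mix pairs of inlier features from different classes.

    Assumption: a convex combination theta * a + (1 - theta) * b with
    theta ~ U(0, 1); the abstract only says "using inlier features".
    """
    if len(np.unique(labels)) < 2:
        raise ValueError("need at least two inlier classes to mix")
    n = feats.shape[0]
    out = np.empty((n_out, feats.shape[1]), dtype=feats.dtype)
    made = 0
    while made < n_out:
        i, j = rng.integers(0, n, size=2)
        if labels[i] == labels[j]:
            continue  # only mix features across different classes
        theta = rng.uniform(0.0, 1.0)
        out[made] = theta * feats[i] + (1.0 - theta) * feats[j]
        made += 1
    return out


def build_training_set(feats, labels, open_domain_feats, n_synth, seed=0):
    """Append pseudo outliers, all labeled with the extra class index K."""
    rng = np.random.default_rng(seed)
    k = int(labels.max()) + 1  # assumed index of the out-of-scope class
    synth = synth_outliers(feats, labels, n_synth, rng)
    outliers = np.concatenate([synth, open_domain_feats], axis=0)
    x = np.concatenate([feats, outliers], axis=0)
    y = np.concatenate([labels, np.full(len(outliers), k, dtype=labels.dtype)])
    return x, y, k
```

Here `feats` stands for sentence-encoder features of in-scope utterances and `open_domain_feats` for features of sentences sampled from an open-domain corpus; both are placeholders. Training a standard classifier on `(x, y)` would then yield an out-of-scope prediction directly as class `k`, without a confidence threshold.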
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Unknown Intent Detection | StackOverflow 50% seen classes (test) | Accuracy | 75.08 | 11 |
| open-set relation extraction | FewRel (test) | Accuracy | 71.19 | 8 |
| open-set relation extraction | TACRED (test) | Accuracy | 0.7155 | 8 |
| Relation Classification | FewRel | Accuracy | 91.97 | 8 |
| Relation Classification | TACRED n known relations | Accuracy | 93.1 | 8 |
| Unknown Intent Detection | CLINC150 25% seen classes (test) | Accuracy | 88.44 | 6 |
| Unknown Intent Detection | StackOverflow 25% seen classes (test) | Accuracy | 68.74 | 6 |
| Unknown Intent Detection | Banking 25% seen classes (test) | Accuracy | 74.11 | 6 |
| Unknown Intent Detection | M-CID-EN 25% seen classes (test) | Accuracy | 87.08 | 6 |
| Unknown Intent Detection | CLINC150 50% seen classes (test) | Accuracy | 88.33 | 6 |