Discovering New Intents with Deep Aligned Clustering
About
Discovering new intents is a crucial task in dialogue systems. Most existing methods are limited in their ability to transfer prior knowledge from known intents to new intents. They also have difficulty providing high-quality supervised signals for learning clustering-friendly features to group unlabeled intents. In this work, we propose an effective method, Deep Aligned Clustering, to discover new intents with the aid of limited known intent data. First, we leverage a few labeled known intent samples as prior knowledge to pre-train the model. Then, we perform k-means to produce cluster assignments as pseudo-labels. Moreover, we propose an alignment strategy to tackle the label inconsistency problem that arises across cluster assignments. Finally, we learn the intent representations under the supervision of the aligned pseudo-labels. When the number of new intents is unknown, we predict the number of intent categories by eliminating low-confidence intent-wise clusters. Extensive experiments on two benchmark datasets show that our method is more robust and achieves substantial improvements over the state-of-the-art methods. The code is released at https://github.com/thuiar/DeepAligned-Clustering.
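The alignment and cluster-count steps described above can be illustrated with a small sketch. It is not the released implementation. It assumes that label inconsistency is handled by matching each new k-means centroid to a centroid from the previous clustering round with an optimal one-to-one assignment (here `scipy.optimize.linear_sum_assignment` on Euclidean distances), and that a "low-confidence" cluster is one smaller than the average cluster size. The function names `align_pseudo_labels` and `estimate_num_clusters`, and the size threshold, are hypothetical choices made for illustration.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def align_pseudo_labels(prev_centroids, new_centroids, new_labels):
    """Relabel new clusters so their indices agree with the previous round.

    Assumption: both rounds use the same number of clusters K, and clusters
    are matched by minimising the total Euclidean distance between centroids.
    """
    # cost[i, j] = distance between new centroid i and previous centroid j
    diff = new_centroids[:, None, :] - prev_centroids[None, :, :]
    cost = np.linalg.norm(diff, axis=-1)

    # Optimal one-to-one matching (Hungarian-style assignment)
    new_idx, prev_idx = linear_sum_assignment(cost)

    # mapping[i] = index of the previous cluster that new cluster i maps to
    mapping = np.empty(len(new_centroids), dtype=int)
    mapping[new_idx] = prev_idx

    aligned_labels = mapping[np.asarray(new_labels)]

    # Reorder centroids so row j corresponds to aligned label j
    aligned_centroids = np.empty_like(new_centroids)
    aligned_centroids[mapping] = new_centroids
    return aligned_labels, aligned_centroids


def estimate_num_clusters(labels, num_initial_clusters):
    """Count clusters that survive a size-based confidence filter.

    Assumption: clustering was run with a deliberately large K', and a
    cluster is dropped when it holds fewer than N / K' samples.
    """
    labels = np.asarray(labels)
    counts = np.bincount(labels, minlength=num_initial_clusters)
    threshold = len(labels) / num_initial_clusters
    return int(np.sum(counts >= threshold))
```

In a training loop, the aligned labels would serve as classification targets for the next epoch, so that a given output index keeps referring to the same cluster from one k-means run to the next. Without such a step, k-means is free to permute cluster indices between runs, which would make the pseudo-labels inconsistent as supervision.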
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| New Intent Discovery | BANKING | NMI | 84.78 | 56 |
| New Intent Discovery | M-CID | NMI | 77.37 | 56 |
| Open intent recognition | StackOverflow | Accuracy | 81.45 | 54 |
| Intent Clustering | CLINC full 2019 | NMI | 93.89 | 13 |
| Intent Clustering | BANKING 2020 (full) | NMI | 79.56 | 13 |
| Intent Clustering | CLINC 1.0 (test) | K Predicted | 130 | 9 |
| Intent Clustering | BANKING 1.0 (test) | K (Pred) | 67 | 9 |
| Generalized Intent Discovery | GID-SD v1 (test) | IND ACC | 91.72 | 5 |
| Generalized Intent Discovery | GID-CD v1 (test) | IND ACC | 97.85 | 5 |
| Generalized Intent Discovery | GID-MD v1 (test) | IND Accuracy | 97.85 | 5 |