New Intent Discovery with Pre-training and Contrastive Learning

About

New intent discovery aims to uncover novel intent categories from user utterances to expand the set of supported intent classes. It is a critical task for the development and service expansion of a practical dialogue system. Despite its importance, this problem remains under-explored in the literature. Existing approaches typically rely on a large amount of labeled utterances and employ pseudo-labeling methods for representation learning and clustering, which are label-intensive, inefficient, and inaccurate. In this paper, we provide new solutions to two important research questions for new intent discovery: (1) how to learn semantic utterance representations and (2) how to better cluster utterances. Particularly, we first propose a multi-task pre-training strategy to leverage rich unlabeled data along with external labeled data for representation learning. Then, we design a new contrastive loss to exploit self-supervisory signals in unlabeled data for clustering. Extensive experiments on three intent recognition benchmarks demonstrate the high effectiveness of our proposed method, which outperforms state-of-the-art methods by a large margin in both unsupervised and semi-supervised scenarios. The source code will be available at https://github.com/zhang-yu-wei/MTP-CLNN.
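The abstract does not spell out the form of the proposed contrastive loss, so the following is only a generic illustration of what a contrastive objective over utterance embeddings can look like. It is a minimal sketch of an in-batch InfoNCE-style loss, not the authors' loss; the function names `cosine` and `info_nce` and the `temperature` value are assumptions made for this example.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchors, positives, temperature=0.1):
    """Mean in-batch InfoNCE loss (generic sketch, not the paper's loss).

    anchors[i] and positives[i] are embeddings of two views of the same
    utterance; every other positives[j] acts as a negative for anchors[i].
    """
    losses = []
    for i, anchor in enumerate(anchors):
        logits = [cosine(anchor, p) / temperature for p in positives]
        # Log-sum-exp with the max subtracted for numerical stability.
        max_logit = max(logits)
        log_denominator = max_logit + math.log(
            sum(math.exp(l - max_logit) for l in logits)
        )
        # Negative log-probability of picking the matching positive.
        losses.append(log_denominator - logits[i])
    return sum(losses) / len(losses)


# Toy embeddings: matched pairs should give a much lower loss than mismatched ones.
anchors = [[1.0, 0.0], [0.0, 1.0]]
aligned_loss = info_nce(anchors, [[1.0, 0.0], [0.0, 1.0]])
shuffled_loss = info_nce(anchors, [[0.0, 1.0], [1.0, 0.0]])
```

In practice the embeddings would come from the pre-trained encoder and the loss would be computed with an autodiff framework; plain Python lists are used here only to keep the sketch dependency-free.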

Yuwei Zhang, Haode Zhang, Li-Ming Zhan, Albert Y.S. Lam, Xiao-Ming Wu • 2022

Related benchmarks

Task                            Dataset               Result          Rank
New Intent Discovery            BANKING               NMI 90.51       76
New Intent Discovery            M-CID                 NMI 85.78       75
Open intent recognition         StackOverflow         Accuracy 88.98  54
Generalized Category Discovery  Banking (test)        Accuracy 70.97  28
Generalized Category Discovery  CLINC (test)          Accuracy 86.18  28
Generalized Category Discovery  StackOverflow (test)  Accuracy 80.36  28
New Intent Discovery            StackOverflow         NMI 78.71       27
New Intent Discovery            CLINC                 NMI 95.44       20
New Intent Discovery            SNIPS                 NMI 89.95       19
New Intent Discovery            DBpedia               NMI 80.17       19
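Several rows above report NMI (normalized mutual information), which scores how well predicted clusters agree with ground-truth intent labels regardless of how the clusters are numbered. The sketch below shows one common way to compute it, assuming arithmetic-mean normalization of the two entropies; the page does not state which normalization the reported numbers use, and the helper names `entropy` and `nmi` are chosen for this example.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in nats) of a label sequence."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def nmi(true_labels, cluster_ids):
    """Normalized mutual information with arithmetic-mean normalization.

    Returns a value in [0, 1]; 1 means the clustering matches the labels
    up to a renaming of cluster ids.
    """
    if len(true_labels) != len(cluster_ids):
        raise ValueError("label sequences must have the same length")
    n = len(true_labels)
    joint = Counter(zip(true_labels, cluster_ids))
    true_counts = Counter(true_labels)
    cluster_counts = Counter(cluster_ids)

    # Mutual information summed over the non-empty cells of the contingency table.
    mi = 0.0
    for (t, c), n_tc in joint.items():
        mi += (n_tc / n) * math.log(n_tc * n / (true_counts[t] * cluster_counts[c]))

    denominator = (entropy(true_labels) + entropy(cluster_ids)) / 2
    # Both partitions trivial (a single group each): treat as perfect agreement.
    return mi / denominator if denominator > 0 else 1.0


# Same partition with swapped cluster ids vs. a partition independent of the labels.
perfect = nmi([0, 0, 1, 1], [1, 1, 0, 0])
independent = nmi([0, 0, 1, 1], [0, 1, 0, 1])
```

Benchmark tables such as the one above typically show this score multiplied by 100.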