Short Text Clustering with Transformers
About
Recent techniques for short text clustering often rely on word embeddings as a transfer learning component. This paper shows that sentence vector representations from Transformers, combined with different clustering methods, can be successfully applied to the task. Furthermore, we demonstrate that the algorithm for enhancement of clustering via iterative classification can further improve the initial clustering performance with different classifiers, including those based on pre-trained Transformer language models.
Leonid Pugachev, Mikhail Burtsev • 2021
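The idea of enhancing a clustering through iterative classification, as summarized in the abstract, can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the 2-D toy vectors stand in for Transformer sentence embeddings, a nearest-centroid rule stands in for the trained classifier, and the training subset is chosen by distance to the cluster centroid. The function name `enhance_by_iterative_classification` and its parameters are hypothetical, and the published algorithm's selection and stopping rules may differ.

```python
import math


def centroid(points):
    """Component-wise mean of a list of equal-length vectors."""
    dim = len(points[0])
    return [sum(p[d] for p in points) / len(points) for d in range(dim)]


def dist(a, b):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def enhance_by_iterative_classification(vectors, labels, n_iter=5, keep_fraction=0.7):
    """Refine cluster labels by repeatedly training on the most central
    members of each cluster and relabeling the remaining points.

    Assumption: a nearest-centroid classifier replaces the real classifier.
    """
    labels = list(labels)
    for _ in range(n_iter):
        clusters = sorted(set(labels))
        train_idx = set()
        class_centroids = {}
        for c in clusters:
            members = [i for i, lab in enumerate(labels) if lab == c]
            center = centroid([vectors[i] for i in members])
            # Keep the points closest to the cluster center as training data.
            members.sort(key=lambda i: dist(vectors[i], center))
            keep = members[: max(1, int(len(members) * keep_fraction))]
            train_idx.update(keep)
            # "Training" the stand-in classifier: one centroid per cluster.
            class_centroids[c] = centroid([vectors[i] for i in keep])
        new_labels = list(labels)
        for i, v in enumerate(vectors):
            if i not in train_idx:
                # Relabel held-out points with the classifier's prediction.
                new_labels[i] = min(clusters, key=lambda c: dist(v, class_centroids[c]))
        if new_labels == labels:
            break  # labels stopped changing
        labels = new_labels
    return labels


# Toy stand-ins for sentence embeddings: two well-separated groups.
vectors = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5),
           (10, 10), (10, 11), (11, 10), (11, 11), (10.5, 10.5)]
# An initial clustering with two mistakes (indices 4 and 9).
initial = [0, 0, 0, 0, 1, 1, 1, 1, 1, 0]
refined = enhance_by_iterative_classification(vectors, initial)
print(refined)
```

In a realistic setting the vectors would come from a sentence encoder and the classifier would be a trained model; the loop structure (select confident points, train, relabel the rest, repeat) is the part this sketch is meant to show.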
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Short Text Clustering | Tweet | -- | 28 |
| Short Text Clustering | AG News (test) | Accuracy 86.53 | 18 |
| Short Text Clustering | Stack Overflow (test) | Accuracy 84.72 | 5 |
| Short Text Clustering | Search Snippets (test) | Accuracy 87.67 | 5 |
| Short Text Clustering | Biomedical corpus (test) | Accuracy 47.78 | 5 |
| Short Text Clustering | Google News Title only | -- | 5 |
| Short Text Clustering | Google News Title and Snippet | -- | 1 |
| Short Text Clustering | Google News S Snippet only | -- | 1 |
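Because cluster IDs are arbitrary, clustering accuracy is commonly computed after finding the one-to-one mapping from clusters to gold classes that maximizes the number of correct assignments. The sketch below shows that computation on made-up labels. It is an assumption that the accuracy figures in the table follow this exact definition, and the function name `clustering_accuracy` is hypothetical.

```python
from itertools import permutations


def clustering_accuracy(true_labels, cluster_ids):
    """Accuracy under the best one-to-one mapping of clusters to classes.

    Brute force over all mappings, so only suitable for a handful of clusters.
    """
    classes = sorted(set(true_labels))
    clusters = sorted(set(cluster_ids))
    # Pad with placeholders so every cluster gets a distinct target
    # even when there are more clusters than classes.
    targets = classes + [None] * max(0, len(clusters) - len(classes))
    best = 0
    for assignment in permutations(targets, len(clusters)):
        mapping = dict(zip(clusters, assignment))
        correct = sum(mapping[c] == t for t, c in zip(true_labels, cluster_ids))
        best = max(best, correct)
    return best / len(true_labels)


# Made-up example: six short texts, three gold classes, three clusters.
gold = ["a", "a", "a", "b", "b", "c"]
pred = [1, 1, 0, 0, 0, 2]
acc = clustering_accuracy(gold, pred)
print(round(acc, 4))
```

The exhaustive search grows factorially with the number of clusters. For larger problems the same mapping can be found in polynomial time by solving an assignment problem, for example with `scipy.optimize.linear_sum_assignment` on the cluster-by-class contingency matrix.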