ZeroDL: Zero-shot Distribution Learning for Text Clustering via Large Language Models
About
The advancements in large language models (LLMs) have brought significant progress in NLP tasks. However, if a task cannot be fully described in prompts, the models may fail to carry it out. In this paper, we propose a simple yet effective method to contextualize a task for an LLM. The method (1) performs open-ended zero-shot inference over the entire dataset, (2) aggregates the inference results, and (3) incorporates the aggregated meta-information into the actual task. We demonstrate its effectiveness on text clustering tasks, empowering LLMs to perform text-to-text clustering and yielding improvements on several datasets. Furthermore, we examine the generated class labels for clustering, showing how the LLM comes to understand the task through the data.
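The three steps above can be sketched as follows. This is a minimal illustration, not the paper's implementation: `llm()` is a hypothetical stub standing in for a real model call, and the aggregation shown (keeping the most frequent open-ended labels as the candidate class set) is one plausible reading of "aggregate the inference results".

```python
from collections import Counter

def llm(prompt: str) -> str:
    """Hypothetical stub for an LLM call; a real system would query a model.
    Returns canned one-word topics so the sketch is runnable."""
    if "match" in prompt or "team" in prompt:
        return "sports"
    return "business"

def zero_shot_distribution_learning(texts):
    # Step 1: open-ended zero-shot inference over the entire dataset.
    raw_labels = [llm(f"Give a one-word topic for: {t}") for t in texts]

    # Step 2: aggregate the inference results into meta-information,
    # here the most frequent candidate labels across the dataset.
    candidate_labels = [label for label, _ in Counter(raw_labels).most_common(10)]

    # Step 3: incorporate the aggregated meta-information into the actual
    # task: zero-shot classification constrained to the discovered label set.
    clusters = {}
    for t in texts:
        choice = llm(f"Pick one of {candidate_labels} for: {t}")
        if choice not in candidate_labels:
            choice = candidate_labels[0]  # fall back if the model strays
        clusters.setdefault(choice, []).append(t)
    return clusters
```

Because the second pass is restricted to labels that actually emerged from the data, the clustering step sees the dataset-level context that a single prompt could not convey.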
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Text Clustering | DBp F | Accuracy | 66.7 | 39 |
| Short Text Clustering | AGNews | Accuracy | 84.3 | 38 |
| Clustering | IMDB | Accuracy | 94.3 | 34 |
| Text Clustering | SST-5 | Accuracy | 52.6 | 25 |
| Text Clustering | SST-2 | Accuracy | 88.3 | 25 |
| Text Clustering | YRev | Accuracy | 53.8 | 25 |
| Text Clustering | DBp B | Accuracy | 78.3 | 25 |
| Text Clustering | Yah B | Accuracy | 74.4 | 21 |
| Text Clustering | Yah F | Accuracy | 52.7 | 21 |
| Text Clustering | Aggregate | Macro Score | 70.3 | 21 |