Knowledge Graph-Enhanced Zero-Shot Topic Classification: A Multi-Strategy Comparative Study
About
Multi-label topic classification without labeled training data is challenging, especially when documents contain complex relational information. We present a zero-shot multi-label topic classification framework and systematically investigate how per-article knowledge graph augmentation affects its performance. The base framework has four variants: article-only classification, keyword-enhanced classification (AK), and a self-consistency decoding variant of each. We then augment each base variant with a per-article knowledge graph, extracted from the input document as subject-predicate-object triples through a pipeline similar to KGGen. We evaluate all eight methods (four base, four graph-augmented) on fifteen LLMs and eight multi-label datasets spanning different domains. Among the base variants, AK performs best, and six of the fifteen LLMs surpass the sentence-encoder baseline. Graph augmentation helps small models but hurts large ones, which suggests that larger models already capture enough relational information from pretraining. Furthermore, self-consistency decoding does not improve performance in any experiment, while increasing computation cost about fivefold.
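
To make the self-consistency variant concrete, the following is a minimal sketch of how several sampled label sets for one document could be merged by majority vote. The abstract does not specify the aggregation rule, so the strict-majority threshold, the function name `aggregate_label_votes`, and the choice of five samples (picked only to match the roughly fivefold cost mentioned above) are all assumptions made for illustration.

```python
from collections import Counter
from typing import Iterable, List, Set


def aggregate_label_votes(samples: Iterable[Set[str]], min_share: float = 0.5) -> List[str]:
    """Merge several sampled multi-label predictions for one document.

    A label is kept when it appears in strictly more than `min_share`
    of the samples. This rule is an assumption, not the paper's stated method.
    """
    sample_list = [set(s) for s in samples]
    if not sample_list:
        return []
    # Count in how many samples each label occurs (at most once per sample).
    counts = Counter(label for labels in sample_list for label in labels)
    total = len(sample_list)
    return sorted(label for label, n in counts.items() if n / total > min_share)


# Hypothetical outputs of five independent LLM samples for one article.
sampled = [
    {"sports", "politics"},
    {"sports"},
    {"sports", "health"},
    {"politics", "sports"},
    {"politics"},
]
print(aggregate_label_votes(sampled))
```

Each sample requires a separate LLM call, which is why this kind of decoding multiplies inference cost by the number of samples.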
Related benchmarks
| Task | Dataset | Metric | Score | Rank |
|---|---|---|---|---|
| Multi-Label Classification | medical | Micro F1-Score | 76.6 | 11 |
| Classification | medical | F1 Score | 76.6 | 10 |
| Classification | News | F1 Score | 75.2 | 3 |
| Multi-label topic classification | News | Micro F1 Score | 74.9 | 3 |
| Multi-label topic classification | Cell. phone | Micro F1 | 76.3 | 3 |
| Multi-label topic classification | Digital Camera 1 | Micro-Avg F1 Score | 75 | 3 |
| Multi-label topic classification | DVD player | Micro-average F1 | 69.9 | 3 |
| Multi-label topic classification | SemEval | Micro-F1 | 65.5 | 3 |
| Classification | Cellular phone | F1 Score | 76.3 | 2 |
| Classification | Digital cam 1 | F1 Score | 75.2 | 2 |
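
Most rows above report a micro-averaged F1 score. As a reminder of what that metric measures, here is a small sketch that pools true positives, false positives, and false negatives over all documents and labels before computing F1. The function name `micro_f1` and the toy label sets are illustrative assumptions and are not taken from the paper's evaluation code.

```python
from typing import Iterable, Set


def micro_f1(gold: Iterable[Set[str]], pred: Iterable[Set[str]]) -> float:
    """Micro-averaged F1 over per-document label sets.

    Counts are pooled across all documents, so frequent labels
    weigh more than rare ones.
    """
    tp = fp = fn = 0
    for g, p in zip(gold, pred):
        g, p = set(g), set(p)
        tp += len(g & p)   # labels predicted and correct
        fp += len(p - g)   # labels predicted but not in gold
        fn += len(g - p)   # gold labels that were missed
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Toy example: two documents with hypothetical topic labels.
gold_labels = [{"battery", "screen"}, {"price"}]
pred_labels = [{"battery"}, {"price", "camera"}]
score = micro_f1(gold_labels, pred_labels)
print(round(100 * score, 1))
```

The leaderboard values appear to be on a 0 to 100 scale, which is why the example multiplies by 100 before printing.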