Taming Pretrained Transformers for Extreme Multi-label Text Classification

About

We consider the extreme multi-label text classification (XMC) problem: given an input text, return the most relevant labels from a large label collection. For example, the input text could be a product description on Amazon.com and the labels could be product categories. XMC is an important yet challenging problem in the NLP community. Recently, deep pretrained transformer models have achieved state-of-the-art performance on many NLP tasks including sentence classification, albeit with small label sets. However, naively applying deep transformer models to the XMC problem leads to sub-optimal performance due to the large output space and the label sparsity issue. In this paper, we propose X-Transformer, the first scalable approach to fine-tuning deep transformer models for the XMC problem. The proposed method achieves new state-of-the-art results on four XMC benchmark datasets. In particular, on a Wiki dataset with around 0.5 million labels, the prec@1 of X-Transformer is 77.28%, a substantial improvement over state-of-the-art XMC approaches Parabel (linear) and AttentionXML (neural), which achieve 68.70% and 76.95% prec@1, respectively. We further apply X-Transformer to a product2query dataset from Amazon and gain a 10.7% relative improvement in prec@1 over Parabel.

Wei-Cheng Chang, Hsiang-Fu Yu, Kai Zhong, Yiming Yang, Inderjit Dhillon • 2019
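
The abstract reports results as prec@1 and as a relative improvement. As a minimal sketch of how these two quantities are commonly computed (this is not the authors' evaluation code; the function names `precision_at_k` and `relative_improvement` and the toy scores are assumptions made for illustration):

```python
def precision_at_k(scores, relevant, k=1):
    """Mean fraction of the k highest-scoring labels that are relevant.

    scores:   one list of per-label scores per input text (hypothetical data)
    relevant: one set of true label indices per input text
    """
    total = 0.0
    for row, rel in zip(scores, relevant):
        # Label indices sorted by descending score, truncated to the top k
        top_k = sorted(range(len(row)), key=lambda j: row[j], reverse=True)[:k]
        total += sum(1 for j in top_k if j in rel) / k
    return total / len(scores)


def relative_improvement(new, old):
    """Relative gain of `new` over `old`, as a fraction of `old`."""
    return (new - old) / old


# Toy example: 2 texts, 3 candidate labels each
scores = [[0.1, 0.7, 0.2],
          [0.9, 0.06, 0.04]]
relevant = [{1}, {2}]

p1 = precision_at_k(scores, relevant, k=1)
p2 = precision_at_k(scores, relevant, k=2)
```

In the toy data, the top-ranked label is correct for the first text only, so `p1` is 0.5. With the Wiki figures quoted in the abstract, `relative_improvement(77.28, 68.70)` gives the relative gain of X-Transformer over Parabel on that dataset.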

Related benchmarks

Task                                Dataset                 Metric                 Result  Rank
Extreme Multi-label Classification  Amazon-670K             P@1                    48.07   41
Extreme Multi-label Classification  Amazon-3M               Precision@1            51.2    33
Extreme Classification              LF-AmazonTitles-131K    P@1                    30.43   32
Extreme Multi-label Classification  Wiki-500K               P@1                    62.62   30
Extreme Multi-label Classification  Wiki10-31K              PSP@1                  15.12   21
Extreme Multi-label Classification  AmazonCat-13K           PSP@1                  50.36   21
Extreme Multi-label Classification  Eurlex-4K               Training Time (hours)  7.5     8
Extreme Multi-label Classification  AmazonCat-13K           Training Time (hours)  147.6   8
Extreme Multi-label Classification  Amazon-670K             Training Time (hours)  514.8   8
Extreme Multi-label Classification  Amazon-3M               Training Time (hours)  542     6
Showing 10 of 19 rows
