Fast Multi-Resolution Transformer Fine-tuning for Extreme Multi-label Text Classification

About

Extreme multi-label text classification (XMC) seeks to find relevant labels from an extremely large label collection for a given text input. Many real-world applications can be formulated as XMC problems, such as recommendation systems, document tagging, and semantic search. Recently, transformer-based XMC methods, such as X-Transformer and LightXML, have shown significant improvement over other XMC methods. Although these methods leverage pre-trained transformer models for text representation, fine-tuning transformer models on a large label space still takes a long time, even with powerful GPUs. In this paper, we propose a novel recursive approach, XR-Transformer, which accelerates the procedure by recursively fine-tuning transformer models on a series of multi-resolution objectives related to the original XMC objective function. Empirical results show that XR-Transformer takes significantly less training time than other transformer-based XMC models while yielding better, state-of-the-art results. In particular, on the public Amazon-3M dataset with 3 million labels, XR-Transformer is not only 20x faster than X-Transformer but also improves Precision@1 from 51% to 54%.

Jiong Zhang, Wei-cheng Chang, Hsiang-fu Yu, Inderjit S. Dhillon • 2021
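
The abstract does not spell out how the multi-resolution objectives are built. As a rough sketch only, assume they come from grouping labels into progressively coarser clusters, where a cluster counts as relevant to a text when any of its member labels is relevant. The function name `coarsen_labels`, the cluster maps, and the toy data below are all hypothetical and are not taken from the paper.

```python
def coarsen_labels(label_sets, cluster_of):
    """Map each instance's label set to the set of clusters that contain those labels.

    label_sets: list of sets of label ids, one set per instance.
    cluster_of: dict mapping each label id to its parent cluster id (assumed given).
    """
    return [{cluster_of[label] for label in labels} for labels in label_sets]


# Hypothetical toy problem: 6 fine labels, grouped into 3 clusters,
# which are grouped again into 2 super-clusters.
fine_targets = [{0, 1}, {4}, {2, 5}]
fine_to_mid = {0: 0, 1: 0, 2: 1, 3: 1, 4: 2, 5: 2}
mid_to_coarse = {0: 0, 1: 0, 2: 1}

mid_targets = coarsen_labels(fine_targets, fine_to_mid)
coarse_targets = coarsen_labels(mid_targets, mid_to_coarse)

# A recursive scheme would train on the coarsest targets first,
# then reuse the model when moving to each finer resolution.
for name, targets in [("coarse", coarse_targets),
                      ("mid", mid_targets),
                      ("fine", fine_targets)]:
    print(name, targets)
```

Under this assumption, each coarser level is a smaller classification problem than the one below it, which is one plausible reading of why early training stages could be cheaper than training on the full label space.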

Related benchmarks

Task                                Dataset                             Metric                 Result  Rank
Extreme Multi-label Classification  Amazon-670K                         P@1                    50.11   41
Extreme Multi-label Classification  Amazon-3M                           Precision@1            54.2    33
Extreme Classification              LF-AmazonTitles-131K                P@1                    38.49   32
Extreme Multi-label Classification  Wiki-500K                           P@1                    79.4    30
Extreme Multi-label Classification  Wiki10-31K                          PSP@1                  12.25   21
Extreme Multi-label Classification  AmazonCat-13K                       PSP@1                  50.72   21
Extreme Multi-label Classification  AmazonCat-13K legacy (test)         Precision@1            0.9679  11
Extreme Multi-label Classification  Wiki10-31K legacy (test)            P@1                    88.69   11
Extreme Multi-label Classification  Amazon-670K large scale XMC (test)  PSP@1                  36.16   9
Extreme Multi-label Classification  Eurlex-4K                           Training Time (hours)  0.8     8
Showing 10 of 24 rows
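
Several of the results listed here are reported as P@1 or Precision@1. As a minimal sketch of that metric, assuming the standard definition (the fraction of the top-k predicted labels that are relevant, averaged over instances), it can be computed as follows. The function name and toy data are illustrative and do not come from the listing.

```python
def precision_at_k(scores, relevant, k=1):
    """Average, over instances, of the fraction of top-k scored labels that are relevant.

    scores: list of per-instance score lists, one score per label.
    relevant: list of sets holding the relevant label indices for each instance.
    """
    total = 0.0
    for row_scores, row_relevant in zip(scores, relevant):
        # Rank label indices from highest to lowest score.
        ranked = sorted(range(len(row_scores)),
                        key=lambda j: row_scores[j],
                        reverse=True)
        hits = sum(1 for j in ranked[:k] if j in row_relevant)
        total += hits / k
    return total / len(scores)


# Toy example: two instances, three labels.
scores = [[0.9, 0.1, 0.3],
          [0.2, 0.7, 0.6]]
relevant = [{0}, {2}]

p_at_1 = precision_at_k(scores, relevant, k=1)
p_at_2 = precision_at_k(scores, relevant, k=2)
print(p_at_1, p_at_2)
```

In the toy data, the top-scored label is relevant for the first instance but not for the second, so P@1 is 0.5. Note that the listing mixes percentage-style values (such as 54.2) with a fraction-style value (0.9679), so the scale should be checked before comparing rows.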