CascadeXML: Rethinking Transformers for End-to-end Multi-resolution Training in Extreme Multi-label Classification

About

Extreme Multi-label Text Classification (XMC) involves learning a classifier that can assign to an input a subset of the most relevant labels from millions of label choices. Recent approaches, such as XR-Transformer and LightXML, leverage a transformer instance to achieve state-of-the-art performance. However, in doing so, these approaches must make various trade-offs between performance and computational requirements. A major shortcoming, compared to the Bi-LSTM-based AttentionXML, is that they fail to keep separate feature representations for each resolution in a label tree. We thus propose CascadeXML, an end-to-end multi-resolution learning pipeline that harnesses the multi-layered architecture of a transformer model to attend to different label resolutions with separate feature representations. CascadeXML significantly outperforms all existing approaches, with non-trivial gains on benchmark datasets of up to three million labels. Code for CascadeXML will be made publicly available at https://github.com/xmc-aalto/cascadexml.

Siddhant Kharbanda, Atmadeep Banerjee, Erik Schultheis, Rohit Babbar • 2022
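
The multi-resolution idea in the abstract can be illustrated with a minimal sketch. This is not the CascadeXML implementation: the function names, the two-level label tree, the dot-product scoring, and the `"coarse"`/`"fine"` feature vectors (standing in for features taken from different transformer layers) are all assumptions made for illustration. The sketch only shows the general pattern of scoring coarse label clusters with one feature representation, shortlisting the best clusters, and then scoring the labels inside them with a separate representation.

```python
# Hypothetical sketch: names, shapes, and dot-product scoring are
# illustrative assumptions, not the authors' code.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def top_k(scores, k):
    """Indices of the k highest scores, best first."""
    return sorted(range(len(scores)), key=lambda i: -scores[i])[:k]

def cascade_predict(layer_feats, cluster_weights, label_weights,
                    cluster_to_labels, shortlist_size, k):
    """Two-resolution prediction over a two-level label tree.

    layer_feats holds one feature vector per resolution, e.g.
    {"coarse": [...], "fine": [...]}.
    """
    # Coarse resolution: score every cluster with the coarse feature.
    cluster_scores = [dot(w, layer_feats["coarse"]) for w in cluster_weights]
    shortlist = top_k(cluster_scores, shortlist_size)

    # Fine resolution: score only labels inside the shortlisted clusters,
    # using the separate fine-resolution feature.
    candidates = [l for c in shortlist for l in cluster_to_labels[c]]
    label_scores = [dot(label_weights[l], layer_feats["fine"]) for l in candidates]
    return [candidates[i] for i in top_k(label_scores, k)]

# Toy example: 2 clusters, 4 labels.
feats = {"coarse": [1.0, 0.0], "fine": [0.0, 1.0]}
cluster_weights = [[1.0, 0.0], [-1.0, 0.0]]
label_weights = [[0.0, 0.2], [0.0, 0.9], [0.0, 5.0], [0.0, 0.1]]
cluster_to_labels = [[0, 1], [2, 3]]

# Only cluster 0 is shortlisted, so label 2 is never scored.
print(cascade_predict(feats, cluster_weights, label_weights,
                      cluster_to_labels, shortlist_size=1, k=1))  # [1]
```

The shortlist keeps the fine-level work proportional to the number of candidate labels rather than to the full label set, which is the usual motivation for tree-based approaches at this scale.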

Related benchmarks

Task	Dataset	Result
Extreme Multi-label Classification	Amazon-670K	P@1: 52.15	63
Extreme Multi-label Classification	Amazon-3M	Precision@1: 53.91	44
Extreme Multi-label Classification	Wiki-500K	P@1: 81.13	30
Extreme Multi-label Classification	AmazonCat-13K	PSP@1: 52.68	21
Extreme Multi-label Classification	Wiki10-31K	PSP@1: 13.36	21
Extreme Multi-label Classification	Wiki10-31K legacy (test)	P@1: 89.74	11
Extreme Multi-label Classification	AmazonCat-13K legacy (test)	Precision@1: 0.969	11
Extreme Multi-label Classification	Amazon-670K large scale XMC (test)	PSP@1: 30.2	9
Extreme Multi-label Classification	Wiki10-31K (test)	PSP@1: 0.132	5
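
For readers unfamiliar with the metric family reported above, precision at k (P@k) for a single instance is the fraction of the k highest-scored labels that are actually relevant. The following is a minimal sketch under that standard definition; the function name and toy data are illustrative. Benchmark figures average this quantity over all test instances. PSP@k additionally weights each hit by an inverse label propensity and is not shown here.

```python
def precision_at_k(scores, relevant, k):
    """P@k for one instance.

    scores:   list of predicted scores, one per label index
    relevant: set of ground-truth label indices
    """
    # Rank label indices by descending score and keep the top k.
    ranked = sorted(range(len(scores)), key=lambda i: -scores[i])[:k]
    # Count how many of the top-k labels are relevant.
    return sum(1 for i in ranked if i in relevant) / k

scores = [0.1, 0.7, 0.4, 0.9]
relevant = {1, 2}

print(precision_at_k(scores, relevant, 1))  # 0.0 (top label 3 is not relevant)
print(precision_at_k(scores, relevant, 3))  # 2 of the top 3 labels are relevant
```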
