Bringing Clustering to MLL: Weakly-Supervised Clustering for Partial Multi-Label Learning
About
Label noise in multi-label learning (MLL) poses significant challenges for model training, particularly in partial multi-label learning (PML) where candidate labels contain both relevant and irrelevant labels. While clustering offers a natural approach to exploit data structure for noise identification, traditional clustering methods cannot be directly applied to multi-label scenarios due to a fundamental incompatibility: clustering produces membership values that sum to one per instance, whereas multi-label assignments require binary values that can sum to any number. We propose a novel weakly-supervised clustering approach for PML (WSC-PML) that bridges clustering and multi-label learning through membership matrix decomposition. Our key innovation decomposes the clustering membership matrix $\mathbf{A}$ into two components: $\mathbf{A} = \mathbf{\Pi} \odot \mathbf{F}$, where $\mathbf{\Pi}$ maintains clustering constraints while $\mathbf{F}$ preserves multi-label characteristics. This decomposition enables seamless integration of unsupervised clustering with multi-label supervision for effective label noise handling. WSC-PML employs a three-stage process: initial prototype learning from noisy labels, adaptive confidence-based weak supervision construction, and joint optimization via iterative clustering refinement. Extensive experiments on 24 datasets demonstrate that our approach outperforms six state-of-the-art methods across all evaluation metrics.
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Partial Multi-Label Learning | Birds | Average Precision | 62.9 | 48 |
| Partial Multi-Label Learning | EMOTIONS | Average Precision | 80.5 | 48 |
| Partial Multi-Label Learning | Image | Average Precision | 0.814 | 48 |
| Partial Multi-Label Learning | Yeast | Average Precision | 75.8 | 39 |
| Partial Multi-Label Learning | Yeast | Ranking Loss | 0.156 | 37 |
| Partial Multi-Label Learning | Birds | -- | -- | 27 |
| Partial Multi-Label Learning | EMOTIONS | Ranking Loss | 0.164 | 26 |
| Partial Multi-Label Learning | Image | Ranking Loss | 0.152 | 24 |
| Partial Multi-Label Learning | medical | Ranking Loss | 0.03 | 21 |
| Partial Multi-Label Learning | medical | Average Precision | 87.5 | 21 |