SEHFS: Structural Entropy-Guided High-Order Correlation Learning for Multi-View Multi-Label Feature Selection
About
In recent years, multi-view multi-label learning (MVML) has attracted extensive attention because it closely matches real-world scenarios. Information-theoretic methods have gained prominence for learning nonlinear correlations. However, two key challenges persist. First, features in real-world data commonly exhibit high-order structural correlations, but existing information-theoretic methods struggle to learn them. Second, because they commonly rely on heuristic optimization, information-theoretic methods are prone to converging to local optima. To address these two challenges, we propose a novel method called Structural Entropy-Guided High-Order Correlation Learning for Multi-View Multi-Label Feature Selection (SEHFS). The core idea of SEHFS is to convert the feature graph into a structural-entropy-minimizing encoding tree, which quantifies the information cost of high-order dependencies and thus learns feature correlations beyond pairwise ones. Specifically, features exhibiting strong high-order redundancy are grouped into a single cluster within the encoding tree, while inter-cluster feature correlations are minimized, thereby eliminating redundancy both within and across clusters. Furthermore, a new framework that fuses information theory with matrix methods is adopted: it learns a shared semantic matrix and view-specific contribution matrices to reconstruct a global view matrix, thereby enhancing the information-theoretic method and balancing global and local optimization. The ability of structural entropy to learn high-order correlations is theoretically established, and experiments on eight datasets from various domains, together with ablation studies, demonstrate that SEHFS achieves superior performance in feature selection.
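To make the notion of a structural-entropy-minimizing grouping concrete, the following is a minimal sketch that scores a partition of a weighted feature graph with the commonly used two-dimensional structural entropy (a two-level encoding tree). It is an illustration under stated assumptions, not the SEHFS implementation: the function name `structural_entropy_2d`, the edge-list input, and the toy graph are all hypothetical, and the abstract does not specify how SEHFS builds its feature graph or how deep its encoding tree is.

```python
import math
from collections import defaultdict


def structural_entropy_2d(edges, partition):
    """Two-dimensional structural entropy of an undirected weighted graph.

    edges:     iterable of (u, v, weight) tuples, one per undirected edge
    partition: dict mapping each node to a cluster id
    (Both argument shapes are assumptions made for this sketch.)
    """
    # Weighted degree of every node, and total volume (= 2 * total edge weight).
    degree = defaultdict(float)
    for u, v, w in edges:
        degree[u] += w
        degree[v] += w
    total_volume = sum(degree.values())

    # Volume of each cluster: sum of the degrees of its nodes.
    volume = defaultdict(float)
    for node, d in degree.items():
        volume[partition[node]] += d

    # Cut weight of each cluster: weight of edges leaving the cluster.
    cut = defaultdict(float)
    for u, v, w in edges:
        if partition[u] != partition[v]:
            cut[partition[u]] += w
            cut[partition[v]] += w

    entropy = 0.0
    # Cost of locating a node inside its cluster.
    for node, d in degree.items():
        if d > 0:
            entropy -= (d / total_volume) * math.log2(d / volume[partition[node]])
    # Cost of entering each cluster from the root of the encoding tree.
    for cluster, vol in volume.items():
        if cut[cluster] > 0:
            entropy -= (cut[cluster] / total_volume) * math.log2(vol / total_volume)
    return entropy


# Toy feature graph: two triangles (strongly correlated groups) joined by one edge.
toy_edges = [(0, 1, 1.0), (1, 2, 1.0), (0, 2, 1.0),
             (3, 4, 1.0), (4, 5, 1.0), (3, 5, 1.0),
             (2, 3, 1.0)]
by_triangle = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
mixed = {0: "A", 3: "A", 1: "B", 4: "B", 2: "C", 5: "C"}

print(structural_entropy_2d(toy_edges, by_triangle))
print(structural_entropy_2d(toy_edges, mixed))
```

On this toy graph, the partition that follows the two triangles yields a lower entropy than the mixed partition, which is the sense in which minimizing structural entropy groups densely correlated features together.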
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Image Classification | VOC 07 | mAP | 59.2 | 27 |
| Multi-Label Classification | MIRFLICKR | AP | 69.2 | 17 |
| Gene function prediction | Yeast | Hamming Loss | 0.223 | 8 |
| Image Annotation | Corel5k | Hamming Loss | 1.3 | 8 |
| Image Annotation | IAPRTC12 | Hamming Loss | 1.7 | 8 |
| Image Annotation | IAPRTC12 | AP | 21.5 | 8 |
| Image Classification | VOC 07 | Hamming Loss | 7.9 | 8 |
| Image Classification | VOC 07 | Ranking Loss | 0.194 | 8 |
| Image Retrieval | Scene | Hamming Loss | 9.2 | 8 |
| Image Retrieval | Object | Hamming Loss | 5.2 | 8 |