Learning Disentangled Representations for Generalized Multi-view Clustering
About
Multi-View Clustering (MVC) has gained significant attention for its ability to leverage complementary information across diverse views. However, existing deep MVC methods often struggle with view-distribution entanglement during cross-view fusion, which degrades the shared latent space and leads to suboptimal clustering results. To address this issue, we propose the Generalized Multi-view Auto-Encoder (GMAE), a framework designed to preserve cross-view complementarity through disentangled representation learning. Specifically, GMAE employs dual-path autoencoders to decouple source features into view-specific and view-common embeddings, facilitating the discovery of clearer clustering structures. We further construct cross-view adversarial discriminators to guide view-specific encoders in capturing more discriminative features. By strategically modulating mutual information, GMAE aligns distributions and prevents representation collapse, ensuring robust, non-trivial embeddings. Comprehensive experiments on 13 benchmark datasets demonstrate that GMAE consistently outperforms state-of-the-art methods in both complete and incomplete MVC tasks. Our code is available at https://github.com/obananas/GMAE.
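To make the dual-path idea concrete, the following is a minimal NumPy sketch of one autoencoder per view, each with a view-common and a view-specific encoding path. It is an illustration under stated assumptions, not the GMAE implementation: the class name `DualPathAutoEncoder`, the helper `linear`, the layer sizes, the single linear layers with `tanh`, and the fusion rule (average the view-common codes, concatenate the view-specific ones) are all hypothetical. The weights are random and untrained, and the adversarial discriminators and mutual-information terms described above are omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

def linear(in_dim, out_dim):
    """Hypothetical helper: a small random weight matrix (no training here)."""
    return rng.normal(scale=0.1, size=(in_dim, out_dim))

class DualPathAutoEncoder:
    """One view's autoencoder with a view-common and a view-specific path."""

    def __init__(self, in_dim, common_dim, specific_dim):
        self.w_common = linear(in_dim, common_dim)      # view-common encoder
        self.w_specific = linear(in_dim, specific_dim)  # view-specific encoder
        self.w_decode = linear(common_dim + specific_dim, in_dim)

    def encode(self, x):
        # Two separate paths over the same input features.
        return np.tanh(x @ self.w_common), np.tanh(x @ self.w_specific)

    def decode(self, z_common, z_specific):
        # The decoder sees both embeddings, so together they must explain x.
        return np.concatenate([z_common, z_specific], axis=1) @ self.w_decode

    def reconstruction_loss(self, x):
        x_hat = self.decode(*self.encode(x))
        return float(np.mean((x - x_hat) ** 2))

# Two toy views of the same 8 samples, with different feature dimensions.
views = [rng.normal(size=(8, 20)), rng.normal(size=(8, 12))]
models = [DualPathAutoEncoder(v.shape[1], common_dim=4, specific_dim=3)
          for v in views]

commons, specifics = zip(*(m.encode(v) for m, v in zip(models, views)))

# Assumed fusion rule: average the view-common codes across views and
# concatenate the view-specific codes alongside them.
fused = np.concatenate([np.mean(commons, axis=0), *specifics], axis=1)
```

In this sketch `fused` has one row per sample and `4 + 3 + 3` columns, and it could be passed to any standard clustering routine. How GMAE actually fuses and trains the embeddings should be checked against the linked repository.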
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Clustering | STL-10 | ACC | 96.25 | 64 |
| Multi-view Clustering | Synthetic3d | ACC | 98 | 42 |
| Multi-view Clustering | LGG | Accuracy | 92.16 | 33 |
| Multi-view Clustering | BRCA | Accuracy (ACC) | 67.59 | 24 |
| Multi-view Clustering | Dermatology | Accuracy | 91.62 | 24 |
| Clustering | Digits (2V) | Accuracy | 95.55 | 16 |
| Clustering | Digits (3V) | Accuracy (ACC) | 95.9 | 16 |
| Clustering | Digits (4V) | Accuracy (Digits 4V) | 96.5 | 16 |
| Clustering | Digits (6V) | Accuracy | 97.45 | 16 |
| Multi-view Clustering | Wikipedia | Accuracy (ACC) | 62.18 | 16 |
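
The accuracy figures in the table are clustering accuracies. Cluster ids are arbitrary, so this metric is conventionally computed under the best one-to-one mapping between predicted clusters and ground-truth labels. The sketch below shows that computation with a brute-force search over mappings, which is only practical for a handful of clusters. The function name `clustering_accuracy` is hypothetical, and whether GMAE's evaluation code follows exactly this convention is an assumption.

```python
from itertools import permutations

def clustering_accuracy(y_true, y_pred):
    """Accuracy under the best one-to-one mapping of cluster ids to labels.

    Brute force over mappings: fine for a few clusters, too slow for many.
    """
    true_ids = sorted(set(y_true))
    pred_ids = sorted(set(y_pred))
    # Pad with None so surplus predicted clusters can map to "no label".
    candidates = true_ids + [None] * max(0, len(pred_ids) - len(true_ids))

    best_hits = 0
    for perm in permutations(candidates, len(pred_ids)):
        mapping = dict(zip(pred_ids, perm))
        hits = sum(mapping[p] == t for t, p in zip(y_true, y_pred))
        best_hits = max(best_hits, hits)
    return best_hits / len(y_true)

y_true = [0, 0, 1, 1, 2, 2]
y_pred = [1, 1, 0, 0, 2, 1]   # same grouping up to renaming, one sample off
acc = clustering_accuracy(y_true, y_pred)   # 5 of 6 samples match
```

For realistic numbers of clusters, the same optimum is usually found with the Hungarian algorithm, for example `scipy.optimize.linear_sum_assignment` applied to the negated contingency matrix. The table values appear to be this fraction expressed as a percentage.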