Rethinking Multi-view Representation Learning via Distilled Disentangling
About
Multi-view representation learning aims to derive robust representations that are both view-consistent and view-specific from diverse data sources. This paper presents an in-depth analysis of existing approaches in this domain, highlighting a commonly overlooked aspect: the redundancy between view-consistent and view-specific representations. To this end, we propose an innovative framework for multi-view representation learning, which incorporates a technique we term 'distilled disentangling'. Our method introduces the concept of masked cross-view prediction, enabling the extraction of compact, high-quality view-consistent representations from various sources without incurring extra computational overhead. Additionally, we develop a distilled disentangling module that efficiently filters out consistency-related information from multi-view representations, resulting in purer view-specific representations. This approach significantly reduces redundancy between view-consistent and view-specific representations, enhancing the overall efficiency of the learning process. Our empirical evaluations reveal that higher mask ratios substantially improve the quality of view-consistent representations. Moreover, we find that reducing the dimensionality of view-consistent representations relative to that of view-specific representations further refines the quality of the combined representations. Our code is accessible at: https://github.com/Guanzhou-Ke/MRDD.
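The masked cross-view prediction idea described above (mask most of one view's input and predict across views to force compact, consistent representations) can be illustrated with a minimal NumPy sketch. This is a simplified illustration, not the paper's implementation: `random_mask` is a hypothetical helper, and the patch-token setup and shapes are assumptions; see the linked MRDD repository for the actual method.

```python
import numpy as np

def random_mask(tokens, mask_ratio, rng):
    """Keep a random subset of patch tokens, dropping `mask_ratio` of them.

    tokens: (num_tokens, dim) array for one view.
    Returns the visible tokens and their (sorted) indices.
    Hypothetical helper for illustration; MRDD's masking may differ.
    """
    n = tokens.shape[0]
    n_keep = max(1, int(round(n * (1.0 - mask_ratio))))
    keep_idx = np.sort(rng.permutation(n)[:n_keep])
    return tokens[keep_idx], keep_idx

# Toy example: 16 patch tokens of dimension 8 from view 1.
# With a high mask ratio (0.75), only 4 tokens stay visible; a
# cross-view predictor would then be trained to reconstruct view 2
# from these few visible tokens, encouraging view-consistent features.
rng = np.random.default_rng(0)
view1_tokens = rng.normal(size=(16, 8))
visible, keep_idx = random_mask(view1_tokens, mask_ratio=0.75, rng=rng)
```

Because masked tokens are simply dropped before encoding, a higher mask ratio also shrinks the encoder's input, which is consistent with the paper's claim that the technique adds no extra computational overhead.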
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Clustering | COIL-20 | ACC | 69.18 | 47 |
| Clustering | COIL-100 | ACC | 65.29 | 28 |
| Clustering | E-MNIST | ACC | 75.93 | 25 |
| Clustering | E-FMNIST | ACC | 58.25 | 13 |
| Clustering | Office-31 | ACC | 37.14 | 13 |
| Classification | E-MNIST (test) | Accuracy | 98.37 | 11 |
| Classification | COIL-100 (test) | Accuracy | 91.17 | 11 |
| Classification | Office-31 (test) | Accuracy | 73.51 | 11 |
| Classification | E-FMNIST (test) | Accuracy | 88.78 | 11 |
| Classification | COIL-20 (test) | Accuracy | 95.97 | 11 |