Rethinking Multi-view Representation Learning via Distilled Disentangling
About
Multi-view representation learning aims to derive robust representations that are both view-consistent and view-specific from diverse data sources. This paper presents an in-depth analysis of existing approaches in this domain, highlighting a commonly overlooked aspect: the redundancy between view-consistent and view-specific representations. To this end, we propose an innovative framework for multi-view representation learning, which incorporates a technique we term 'distilled disentangling'. Our method introduces the concept of masked cross-view prediction, enabling the extraction of compact, high-quality view-consistent representations from various sources without incurring extra computational overhead. Additionally, we develop a distilled disentangling module that efficiently filters out consistency-related information from multi-view representations, resulting in purer view-specific representations. This approach significantly reduces redundancy between view-consistent and view-specific representations, enhancing the overall efficiency of the learning process. Our empirical evaluations reveal that higher mask ratios substantially improve the quality of view-consistent representations. Moreover, we find that reducing the dimensionality of view-consistent representations relative to that of view-specific representations further refines the quality of the combined representations. Our code is accessible at: https://github.com/Guanzhou-Ke/MRDD.
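The masked cross-view prediction idea described above (mask most of one view's input and predict across views to force compact, consistent representations) can be illustrated with a minimal NumPy sketch. This is a simplified illustration, not the paper's implementation: `random_mask` is a hypothetical helper, and the patch-token setup and shapes are assumptions; see the linked MRDD repository for the actual method.

```python
import numpy as np

def random_mask(tokens, mask_ratio, rng):
    """Keep a random subset of patch tokens, dropping `mask_ratio` of them.

    tokens: (num_tokens, dim) array for one view.
    Returns the visible tokens and their (sorted) indices.
    Hypothetical helper for illustration; MRDD's masking may differ.
    """
    n = tokens.shape[0]
    n_keep = max(1, int(round(n * (1.0 - mask_ratio))))
    keep_idx = np.sort(rng.permutation(n)[:n_keep])
    return tokens[keep_idx], keep_idx

# Toy example: 16 patch tokens of dimension 8 from view 1.
# With a high mask ratio (0.75), only 4 tokens stay visible; a
# cross-view predictor would then be trained to reconstruct view 2
# from these few visible tokens, encouraging view-consistent features.
rng = np.random.default_rng(0)
view1_tokens = rng.normal(size=(16, 8))
visible, keep_idx = random_mask(view1_tokens, mask_ratio=0.75, rng=rng)
```

Because masked tokens are simply dropped before encoding, a higher mask ratio also shrinks the encoder's input, which is consistent with the paper's claim that the technique adds no extra computational overhead.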
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Clustering | COIL-20 | ACC | 69.18 | 47 |
| Clustering | COIL-100 | ACC | 65.29 | 28 |
| Clustering | E-MNIST | ACC | 75.93 | 25 |
| Clustering | E-FMNIST | ACC | 58.25 | 13 |
| Clustering | Office-31 | ACC | 37.14 | 13 |
| Classification | E-MNIST (test) | Accuracy | 98.37 | 11 |
| Classification | COIL-100 (test) | Accuracy | 91.17 | 11 |
| Classification | Office-31 (test) | Accuracy | 73.51 | 11 |
| Classification | E-FMNIST (test) | Accuracy | 88.78 | 11 |
| Classification | COIL-20 (test) | Accuracy | 95.97 | 11 |