
Multi-Modal Deep Clustering: Unsupervised Partitioning of Images

About

Clustering unlabeled raw images is a daunting task, which has recently been approached with some success by deep learning methods. Here we propose an unsupervised clustering framework that learns a deep neural network end to end and provides direct cluster assignments of images without additional processing. The framework, Multi-Modal Deep Clustering (MMDC), trains a deep network to align its image embeddings with target points sampled from a Gaussian Mixture Model distribution. Cluster assignments are then determined by associating each image embedding with a mixture component. Simultaneously, the same deep network is trained to solve an additional self-supervised task: predicting image rotations. This pushes the network to learn more meaningful image representations that facilitate better clustering. Experimental results show that MMDC achieves or exceeds state-of-the-art performance on six challenging benchmarks. On natural image datasets we improve on previous results by margins of up to 20 absolute accuracy points, yielding an accuracy of 82% on CIFAR-10, 45% on CIFAR-100 and 69% on STL-10.
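The three ingredients named in the abstract (targets drawn from a Gaussian mixture, assignment by mixture component, and rotation labels for the auxiliary task) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names are hypothetical, the mixture is assumed to have equal weights and a shared isotropic covariance (so the most likely component is simply the nearest mean), and the network training that aligns embeddings with targets is left out entirely.

```python
import random

def sample_gmm_targets(means, sigma, n, rng):
    """Draw n target points from an equal-weight, isotropic Gaussian mixture.

    Returns the points and the index of the component each one came from.
    Equal weights and a shared sigma are assumptions of this sketch.
    """
    targets, components = [], []
    for _ in range(n):
        k = rng.randrange(len(means))
        targets.append([rng.gauss(m, sigma) for m in means[k]])
        components.append(k)
    return targets, components

def assign_clusters(embeddings, means):
    """Assign each embedding to the component with the nearest mean.

    Under the equal-weight, shared-isotropic-covariance assumption this is
    the component with the highest posterior probability.
    """
    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return [min(range(len(means)), key=lambda k: sqdist(e, means[k]))
            for e in embeddings]

def rotate90(image, times):
    """Rotate a 2-D list-of-lists image counter-clockwise by 90 degrees, `times` times."""
    for _ in range(times % 4):
        image = [list(row) for row in zip(*image)][::-1]
    return image

def rotation_task(image):
    """Build the four (rotated image, rotation label) pairs for the auxiliary task."""
    return [(rotate90(image, r), r) for r in range(4)]

# Usage: well-separated components, so sampled targets map back to their source.
rng = random.Random(0)
means = [[4.0, 0.0], [-4.0, 0.0], [0.0, 4.0]]
targets, components = sample_gmm_targets(means, sigma=0.1, n=6, rng=rng)
labels = assign_clusters(targets, means)
pairs = rotation_task([[1, 2], [3, 4]])
```

In a full system the embeddings passed to `assign_clusters` would come from the trained network rather than from the sampled targets; the usage lines above only check that the two helper functions agree with each other.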

Guy Shiran, Daphna Weinshall • 2019

Related benchmarks

Task              Dataset                 Result         Rank
Image Clustering  CIFAR-10                NMI 0.72       318
Image Clustering  STL-10                  ACC 74.1       282
Image Clustering  ImageNet-10             NMI 0.732      201
Clustering        CIFAR-10 (test)         Accuracy 70    190
Clustering        STL-10 (test)           Accuracy 61.1  152
Clustering        CIFAR-100 (test)        ACC 31.2       123
Clustering        MNIST                   NMI 0.973      113
Image Clustering  CIFAR-100               ACC 46.4       111
Clustering        ImageNet-10 (test)      ACC 81.1       74
Clustering        ImageNet-Dogs (test)    NMI 0.274      40
Showing 10 of 11 rows
