Multi-Modal Deep Clustering: Unsupervised Partitioning of Images
About
Clustering unlabeled raw images is a daunting task that deep learning methods have recently approached with some success. Here we propose an unsupervised clustering framework that trains a deep neural network end to end and provides direct cluster assignments of images without additional processing. Our method, Multi-Modal Deep Clustering (MMDC), trains a deep network to align its image embeddings with target points sampled from a Gaussian Mixture Model distribution. Cluster assignments are then determined by the mixture component with which each image embedding is associated. Simultaneously, the same network is trained to solve an additional self-supervised task of predicting image rotations. This pushes the network to learn more meaningful image representations, which in turn facilitate better clustering. Experimental results show that MMDC matches or exceeds state-of-the-art performance on six challenging benchmarks. On natural image datasets we improve on previous results by significant margins of up to 20 absolute percentage points in accuracy, reaching 82% on CIFAR-10, 45% on CIFAR-100 and 69% on STL-10.
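The two ingredients described in the abstract, GMM target points with assignment by mixture component, and a rotation-prediction task, can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation. The function names, the one-hot layout of the component means, the shared isotropic variance, equal mixture weights, and the four rotation angles (0, 90, 180, 270 degrees) are all assumptions made for illustration; the abstract specifies none of them.

```python
import random

def make_gmm_means(k, dim, radius=5.0):
    """Hypothetical layout: place k component means on distinct axes, scaled by radius."""
    assert k <= dim
    return [[radius if j == i else 0.0 for j in range(dim)] for i in range(k)]

def sample_targets(means, n, sigma=1.0, seed=0):
    """Draw n target points from an equal-weight GMM with isotropic std sigma (assumed)."""
    rng = random.Random(seed)
    targets, components = [], []
    for _ in range(n):
        comp = rng.randrange(len(means))  # pick a mixture component uniformly
        targets.append([rng.gauss(m, sigma) for m in means[comp]])
        components.append(comp)
    return targets, components

def assign_cluster(embedding, means):
    """Cluster = nearest component mean.

    With equal weights and a shared isotropic covariance (both assumed here),
    the nearest mean is also the most likely mixture component.
    """
    dists = [sum((e - m) ** 2 for e, m in zip(embedding, mu)) for mu in means]
    return dists.index(min(dists))

def rotate90(image):
    """Rotate a 2D list-of-lists image by 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]

def rotation_task(image):
    """Return (rotated_image, label) pairs for 0, 90, 180, 270 degrees (assumed angles)."""
    views, current = [], [list(r) for r in image]
    for label in range(4):
        views.append((current, label))
        current = rotate90(current)
    return views

if __name__ == "__main__":
    means = make_gmm_means(k=3, dim=4)
    targets, components = sample_targets(means, n=5)
    # An embedding close to the first mean is assigned to cluster 0.
    print(assign_cluster([4.8, 0.1, -0.2, 0.0], means))
    # Each rotated view carries the label a rotation classifier would predict.
    for view, label in rotation_task([[1, 2], [3, 4]]):
        print(label, view)
```

In a real training loop the embedding would come from the network, and a loss would pull embeddings toward their matched targets while a second head classifies the rotation label. How targets are matched to images is not described in the abstract and is left out here.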
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Image Clustering | CIFAR-10 | NMI | 0.72 | 243 |
| Image Clustering | STL-10 | ACC | 74.1 | 229 |
| Clustering | CIFAR-10 (test) | Accuracy | 70 | 184 |
| Image Clustering | ImageNet-10 | NMI | 0.732 | 166 |
| Clustering | STL-10 (test) | Accuracy | 61.1 | 146 |
| Clustering | CIFAR-100 (test) | ACC | 31.2 | 110 |
| Image Clustering | CIFAR-100 | ACC | 46.4 | 101 |
| Clustering | MNIST | NMI | 0.973 | 92 |
| Clustering | ImageNet-10 (test) | ACC | 81.1 | 69 |
| Image Clustering | Tiny-ImageNet | ACC | 0.121 | 37 |
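Cluster indices are arbitrary, so clustering accuracy (the ACC and Accuracy entries above) is commonly computed after finding the best one-to-one mapping between cluster indices and ground-truth classes. The sketch below shows that idea. It is a generic illustration and not necessarily the exact evaluation protocol behind each number in the table. The function name `clustering_accuracy` is hypothetical, and the brute-force search over permutations is only practical for a small number of clusters.

```python
from itertools import permutations

def clustering_accuracy(true_labels, cluster_ids, k):
    """Best accuracy over all one-to-one mappings from cluster index to class index."""
    # counts[c][t] = number of samples placed in cluster c whose true class is t
    counts = [[0] * k for _ in range(k)]
    for t, c in zip(true_labels, cluster_ids):
        counts[c][t] += 1
    # Try every mapping cluster c -> class perm[c] and keep the best match count.
    best = max(
        sum(counts[c][perm[c]] for c in range(k))
        for perm in permutations(range(k))
    )
    return best / len(true_labels)

if __name__ == "__main__":
    true_labels = [0, 0, 1, 1, 2, 2]
    cluster_ids = [1, 1, 2, 2, 0, 0]  # same partition, different index names
    print(clustering_accuracy(true_labels, cluster_ids, k=3))  # prints 1.0
```

For larger numbers of clusters, the same mapping is usually found with the Hungarian algorithm, for example via `scipy.optimize.linear_sum_assignment` applied to the negated count matrix, rather than by enumerating permutations.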