How to Achieve the Intended Aim of Deep Clustering Now, without Deep Learning
About
Deep clustering (DC) is often claimed to hold a key advantage over $k$-means clustering. Yet this advantage is usually demonstrated on image datasets only, and it is unclear whether it addresses the fundamental limitations of $k$-means clustering. Deep Embedded Clustering (DEC) learns a latent representation via an autoencoder and performs clustering with a $k$-means-like procedure, optimized in an end-to-end manner. This paper investigates whether the deep-learned representation enables DEC to overcome the known fundamental limitations of $k$-means clustering, i.e., its inability to discover clusters of arbitrary shapes, varied sizes and densities. Our investigation of DEC has wider implications for deep clustering methods in general; notably, none of these methods exploits the underlying data distribution. We show that a non-deep-learning approach achieves the intended aim of deep clustering by using the distributional information of clusters in a dataset to effectively address these fundamental limitations.
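The limitation in question is easy to reproduce: $k$-means partitions space with straight Voronoi boundaries, so it cannot separate interleaved, non-convex clusters such as the 2Crescents benchmark. The sketch below, a toy illustration and not the paper's actual method, contrasts plain Lloyd's $k$-means with a minimal connectivity-based clustering (a single-linkage/DBSCAN-style stand-in for methods that exploit the data distribution). The data generator, the `eps` threshold, and all function names here are illustrative choices, not from the paper.

```python
import math
import random
from collections import Counter

random.seed(0)

def make_crescents(n=200, noise=0.03):
    """Two interleaved half-circles, a toy '2Crescents'-style dataset."""
    pts, labels = [], []
    for i in range(n):
        t = math.pi * i / (n - 1)
        # upper crescent
        pts.append((math.cos(t) + random.gauss(0, noise),
                    math.sin(t) + random.gauss(0, noise)))
        labels.append(0)
        # lower crescent, shifted and flipped
        pts.append((1.0 - math.cos(t) + random.gauss(0, noise),
                    0.5 - math.sin(t) + random.gauss(0, noise)))
        labels.append(1)
    return pts, labels

def kmeans(pts, k=2, iters=50):
    """Plain Lloyd's k-means; its linear Voronoi boundary cannot follow curved clusters."""
    centers = random.sample(pts, k)
    assign = [0] * len(pts)
    for _ in range(iters):
        assign = [min(range(k),
                      key=lambda c: (p[0] - centers[c][0]) ** 2 +
                                    (p[1] - centers[c][1]) ** 2)
                  for p in pts]
        for c in range(k):
            members = [p for p, a in zip(pts, assign) if a == c]
            if members:
                centers[c] = (sum(x for x, _ in members) / len(members),
                              sum(y for _, y in members) / len(members))
    return assign

def connectivity_clusters(pts, eps=0.25):
    """Union-find over pairs closer than eps: a minimal connectivity-based
    clustering, standing in for distribution-aware methods."""
    parent = list(range(len(pts)))
    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i
    for i in range(len(pts)):
        for j in range(i + 1, len(pts)):
            if (pts[i][0] - pts[j][0]) ** 2 + (pts[i][1] - pts[j][1]) ** 2 < eps * eps:
                pi, pj = find(i), find(j)
                if pi != pj:
                    parent[pi] = pj
    remap = {r: c for c, r in enumerate(sorted({find(i) for i in range(len(pts))}))}
    return [remap[find(i)] for i in range(len(pts))]

def clustering_accuracy(pred, truth):
    """Majority-label purity: fraction of points whose cluster's majority
    true label matches their own label."""
    correct = 0
    for c in set(pred):
        counts = Counter(t for p, t in zip(pred, truth) if p == c)
        correct += counts.most_common(1)[0][1]
    return correct / len(truth)

pts, truth = make_crescents()
acc_km = clustering_accuracy(kmeans(pts), truth)
acc_conn = clustering_accuracy(connectivity_clusters(pts), truth)
# k-means cuts across both crescents, while the connectivity-based
# clustering recovers each crescent as a single connected component.
print(f"k-means accuracy: {acc_km:.2f}, connectivity accuracy: {acc_conn:.2f}")
```

Because the two crescents are not linearly separable, no placement of two $k$-means centroids can recover them, whereas any method that follows the local density structure does so easily; this is the gap the paper argues a non-deep-learning approach can close.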
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Image Clustering | CIFAR-10 | NMI 0.74 | 243 |
| Image Clustering | STL-10 | -- | 229 |
| Image Clustering | ImageNet-10 | NMI 0.88 | 166 |
| Clustering | COIL-20 | -- | 47 |
| Clustering | Imagenet Dogs | NMI 51 | 46 |
| Clustering | DLPFC | ARI 54 | 30 |
| Clustering | MNIST | NMI 82 | 24 |
| Clustering | MNIST | ARI 0.77 | 19 |
| Clustering | 2Crescents | ARI 1 | 4 |
| Clustering | Diff-Sizes | ARI 97 | 4 |