Rethinking Divisive Hierarchical Clustering from a Distributional Perspective

About

We show that current objective-based Divisive Hierarchical Clustering (DHC) methods produce dendrograms that lack three desirable properties: no unwarranted splitting, grouping of similar clusters into the same subset, and ground-truth correspondence. The root cause of this shortcoming is the use of a set-oriented bisecting assessment criterion. We show that the shortcoming can be addressed by using a distributional kernel in place of the set-oriented criterion, and that the resulting clusters achieve a new distribution-oriented objective: maximizing the total similarity of all clusters (TSC). Our theoretical analysis shows that the resulting dendrogram guarantees a lower bound on TSC. The empirical evaluation demonstrates the effectiveness of the proposed method on artificial and Spatial Transcriptomics (bioinformatics) datasets. The proposed method creates a dendrogram that is consistent with the biological regions in a Spatial Transcriptomics dataset, whereas other contenders fail.
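To make the idea of a distribution-oriented criterion concrete, here is a minimal Python sketch. It is not the authors' method: it assumes a Gaussian kernel mean embedding as the distributional kernel, and `total_similarity` is a hypothetical stand-in for TSC that sums, over all clusters, the similarity of each point to the distribution of its own cluster. The function names and the `gamma` parameter are illustrative choices, not taken from the paper.

```python
import numpy as np


def gaussian_kernel(X, Y, gamma=1.0):
    """Pairwise Gaussian kernel values between rows of X and rows of Y."""
    sq_dist = (
        np.sum(X ** 2, axis=1)[:, None]
        + np.sum(Y ** 2, axis=1)[None, :]
        - 2.0 * X @ Y.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-gamma * np.maximum(sq_dist, 0.0))


def distributional_similarity(P, Q, gamma=1.0):
    """Similarity of two point sets viewed as distributions.

    Assumption: the mean of k(x, y) over all x in P and y in Q, i.e. the
    inner product of the two kernel mean embeddings.
    """
    return float(gaussian_kernel(P, Q, gamma).mean())


def total_similarity(X, labels, gamma=1.0):
    """Hypothetical TSC-style score (an assumption, not the paper's formula).

    For every cluster, add up the similarity between each member point and
    the distribution of that cluster.
    """
    total = 0.0
    for c in np.unique(labels):
        members = X[labels == c]
        total += gaussian_kernel(members, members, gamma).mean(axis=1).sum()
    return float(total)
```

Under these assumptions, a partition that keeps each dense group intact scores higher than one that mixes points from different groups, which is the behavior a distribution-oriented objective is meant to reward.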

Kaifeng Zhang, Kai Ming Ting, Tianrun Liang, Qiuran Zhao • 2026

Related benchmarks

Task | Dataset | Result | Rank
Hierarchical Agglomerative Clustering | Wine | Dendrogram Purity 0.95 | 26
Hierarchical Clustering | LSVT | Dendrogram Purity 74 | 6
Hierarchical Clustering | musk | Dendrogram Purity 57 | 6
Hierarchical Clustering | Spam | Dendrogram Purity 84 | 6
Hierarchical Clustering | STL-10 | Dendrogram Purity 0.63 | 6
Hierarchical Clustering | ALLAML | Dendrogram Purity 73 | 6
Hierarchical Clustering | SEEDS | Dendrogram Purity 87 | 6
Hierarchical Clustering | WDBC | Dendrogram Purity 90 | 6
Hierarchical Clustering | LandCover | Dendrogram Purity 55 | 6
Hierarchical Clustering | banknote | Dendrogram Purity 97 | 6
Showing 10 of 14 rows
