Our new X account is live! Follow @wizwand_team for updates
WorkDL logo mark

Ultrametric Fitting by Gradient Descent

About

We study the problem of fitting an ultrametric distance to a dissimilarity graph in the context of hierarchical cluster analysis. Standard hierarchical clustering methods are specified procedurally, rather than in terms of the cost function to be optimized. We aim to overcome this limitation by presenting a general optimization framework for ultrametric fitting. Our approach consists of modeling the latter as a constrained optimization problem over the continuous space of ultrametrics. So doing, we can leverage the simple, yet effective, idea of replacing the ultrametric constraint with a min-max operation injected directly into the cost function. The proposed reformulation leads to an unconstrained optimization problem that can be efficiently solved by gradient descent methods. The flexibility of our framework allows us to investigate several cost functions, following the classic paradigm of combining a data fidelity term with a regularization. While we provide no theoretical guarantee to find the global optimum, the numerical results obtained over a number of synthetic and real datasets demonstrate the good performance of our approach with respect to state-of-the-art agglomerative algorithms. This makes us believe that the proposed framework sheds new light on the way to design a new generation of hierarchical clustering methods. Our code is made publicly available at \url{https://github.com/PerretB/ultrametric-fitting}.

Giovanni Chierchia, Benjamin Perret• 2019

Related benchmarks

TaskDatasetResultRank
Hierarchical Agglomerative ClusteringWine
Dendrogram Purity0.789
26
Hierarchical Agglomerative ClusteringIris
Dendrogram Purity0.815
20
Hierarchical Agglomerative ClusteringDigits
Dendrogram Purity0.697
20
Hierarchical ClusteringBr. Cancer
DP Score95
10
Hierarchical ClusteringSpambase
DP59.9
10
Hierarchical ClusteringSpambase
Dasgupta's Cost5.99e+7
10
Hierarchical ClusteringOpticalDigits
Dasgupta Cost4.51e+5
10
Hierarchical Clusteringzoo
DP0.933
10
Hierarchical Clusteringpendigits
DP70
9
Showing 9 of 9 rows

Other info

Follow for update