AuToMATo: An Out-Of-The-Box Persistence-Based Clustering Algorithm

About

We present AuToMATo, a novel clustering algorithm based on persistent homology. While AuToMATo is not parameter-free per se, we provide default choices for its parameters that make it into an out-of-the-box clustering algorithm that performs well across the board. AuToMATo combines the existing ToMATo clustering algorithm with a bootstrapping procedure in order to separate significant peaks of an estimated density function from non-significant ones. We perform a thorough comparison of AuToMATo (with its parameters fixed to their defaults) against many other state-of-the-art clustering algorithms. We find not only that AuToMATo compares favorably against parameter-free clustering algorithms, but in many instances also significantly outperforms even the best selection of parameters for other algorithms. AuToMATo is motivated by applications in topological data analysis, in particular the Mapper algorithm, where it is desirable to work with a clustering algorithm that does not need tuning of its parameters. Indeed, we provide evidence that AuToMATo performs well when used with Mapper. Finally, we provide an open-source implementation of AuToMATo in Python that is fully compatible with the standard scikit-learn architecture.
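The persistence-based idea behind ToMATo can be illustrated with a small sketch: points are visited in order of decreasing density, each point either starts a new cluster (a density peak) or joins a neighbor's cluster, and a cluster whose peak is not prominent enough is merged into a neighboring cluster with a higher peak. This is a simplified illustration, not the authors' implementation. The function name `tomato_like_clusters`, the toy density values, and the chain-shaped neighbor graph are all assumptions made for the example. The prominence threshold `tau` is passed in by hand here, whereas AuToMATo, according to the abstract, uses a bootstrapping procedure to separate significant peaks from non-significant ones.

```python
def tomato_like_clusters(density, neighbors, tau):
    """Toy ToMATo-style clustering (hypothetical helper, not the authors' code).

    density   -- estimated density value per point
    neighbors -- neighbors[i] lists the indices adjacent to point i
    tau       -- prominence threshold, chosen by hand in this sketch
    Returns, for each point, the index of the density peak of its cluster.
    """
    n = len(density)
    parent = [None] * n  # None marks a point that has not been visited yet

    def find(i):
        # Union-find lookup with path halving; the root is the cluster's peak.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Visit points from highest to lowest density.
    for i in sorted(range(n), key=lambda k: -density[k]):
        visited = [j for j in neighbors[i] if parent[j] is not None]
        if not visited:
            parent[i] = i  # no higher neighbor: i is a local density peak
            continue
        # Attach i to the cluster of its highest-density neighbor.
        best = max(visited, key=lambda j: density[j])
        parent[i] = find(best)
        # Point i connects clusters; merge those with low prominence.
        for j in visited:
            r_i, r_j = find(i), find(j)
            if r_i == r_j:
                continue
            low, high = sorted((r_i, r_j), key=lambda r: density[r])
            if density[low] - density[i] < tau:
                parent[low] = high  # the lower peak is absorbed by the higher one
    return [find(i) for i in range(n)]


# Toy data: seven points on a line with two density peaks (indices 1 and 5).
density = [1.0, 3.0, 2.0, 0.5, 2.5, 5.0, 4.0]
neighbors = [[1], [0, 2], [1, 3], [2, 4], [3, 5], [4, 6], [5]]

two_clusters = tomato_like_clusters(density, neighbors, tau=1.0)
one_cluster = tomato_like_clusters(density, neighbors, tau=3.0)
```

With the small threshold both peaks survive as separate clusters. With the larger threshold the peak at index 1, whose prominence is 3.0 - 0.5 = 2.5, falls below `tau` and is merged into the cluster of the peak at index 5. Choosing this threshold is the step that a parameter-free method has to automate.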

Marius Huber, Sara Kalisnik, Patrick Schnider • 2024

Related benchmarks

Task        Dataset       Result     Rank
Clustering  Statlog       ARI 43.3   30
Clustering  Yeast         ARI 1.2    29
Clustering  Shuttle       ARI 0.685  20
Clustering  SEEDS         ARI 0.789  20
Clustering  Anuran Calls  ARI 0.51   20
Clustering  PenBased      ARI 0.631  20
Clustering  pima          ARI 0.012  20
Clustering  TUANDROMD     ARI 0.051  20
Clustering  phoneme       ARI 1.9    20
Clustering  MSRA25        ARI 0.118  20
Showing 10 of 22 rows
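
The results above are reported as ARI, the adjusted Rand index, which compares a predicted clustering against reference labels while correcting for chance agreement. The following is a minimal pure-Python sketch of the standard formula, included only to make the metric concrete. The function name `adjusted_rand_index` and the toy label lists are assumptions for illustration, and nothing here reproduces the benchmark pipeline behind the table.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand index between two labelings of the same points."""
    n = len(labels_true)
    if n != len(labels_pred):
        raise ValueError("both labelings must cover the same points")
    if n < 2:
        return 1.0  # trivial case: no pairs of points to compare

    # Count pairs of points grouped together in both, or in each, labeling.
    pairs_both = sum(comb(c, 2) for c in Counter(zip(labels_true, labels_pred)).values())
    pairs_true = sum(comb(c, 2) for c in Counter(labels_true).values())
    pairs_pred = sum(comb(c, 2) for c in Counter(labels_pred).values())

    expected = pairs_true * pairs_pred / comb(n, 2)  # agreement expected by chance
    maximum = (pairs_true + pairs_pred) / 2
    if maximum == expected:
        return 1.0  # degenerate case, e.g. both labelings are a single cluster
    return (pairs_both - expected) / (maximum - expected)


# Same partition with the label names swapped: perfect agreement.
same_partition = adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0])
# A split that cuts across the reference clusters: worse than chance.
crossed_partition = adjusted_rand_index([0, 0, 1, 1], [0, 1, 0, 1])
```

The index equals 1 when the two partitions are identical up to renaming of labels, is close to 0 for random labelings, and can be negative when agreement is worse than chance.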
