Hierarchical Nearest Neighbor Graph Embedding for Efficient Dimensionality Reduction

About

Dimensionality reduction is crucial both for visualization and preprocessing high dimensional data for machine learning. We introduce a novel method based on a hierarchy built on 1-nearest neighbor graphs in the original space which is used to preserve the grouping properties of the data distribution on multiple levels. The core of the proposal is an optimization-free projection that is competitive with the latest versions of t-SNE and UMAP in performance and visualization quality while being an order of magnitude faster in run-time. Furthermore, its interpretable mechanics, the ability to project new data, and the natural separation of data clusters in visualizations make it a general purpose unsupervised dimension reduction technique. In the paper, we argue about the soundness of the proposed method and evaluate it on a diverse collection of datasets with sizes varying from 1K to 11M samples and dimensions from 28 to 16K. We perform comparisons with other state-of-the-art methods on multiple metrics and target dimensions highlighting its efficiency and performance. Code is available at https://github.com/koulakis/h-nne

M. Saquib Sarfraz, Marios Koulakis, Constantin Seibold, Rainer Stiefelhagen• 2022

Related benchmarks

Task	Dataset	Result
Dimensionality Reduction	CIFAR10	Trustworthiness Score0.907	45
Classification	warpPIE 10P	Accuracy72.5	26
Dimensionality Reduction	F-MNIST	Triplet Centroid Accuracy92.5	10
Dimensionality Reduction	MNIST	Triplet Centroid Accuracy67.1	10
Classification	SAM561	Accuracy83.8	8
Classification	HCL500	Accuracy62.2	8
Classification	MC1374	Accuracy62.3	8
Classification	PROT579	Accuracy82.4	8
Classification	GA1457	Accuracy0.774	8
Dimensionality Reduction	COIL20	Triplet Centroid Accuracy79.9	5

Showing 10 of 28 rows

Other info

Code

Follow for update

@wizwand_team Discord