
Parametric UMAP embeddings for representation and semi-supervised learning

About

UMAP is a non-parametric graph-based dimensionality reduction algorithm using applied Riemannian geometry and algebraic topology to find low-dimensional embeddings of structured data. The UMAP algorithm consists of two steps: (1) Compute a graphical representation of a dataset (fuzzy simplicial complex), and (2) Through stochastic gradient descent, optimize a low-dimensional embedding of the graph. Here, we extend the second step of UMAP to a parametric optimization over neural network weights, learning a parametric relationship between data and embedding. We first demonstrate that Parametric UMAP performs comparably to its non-parametric counterpart while conferring the benefit of a learned parametric mapping (e.g. fast online embeddings for new data). We then explore UMAP as a regularization, constraining the latent distribution of autoencoders, parametrically varying global structure preservation, and improving classifier accuracy for semi-supervised learning by capturing structure in unlabeled data.

Google Colab walkthrough: https://colab.research.google.com/drive/1WkXVZ5pnMrm17m0YgmtoNjM_XHdnE5Vp?usp=sharing
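The two steps described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation and makes several simplifying assumptions: the graph weights use a crude per-point bandwidth instead of UMAP's smooth k-NN calibration, the low-dimensional similarity is fixed at q = 1 / (1 + d^2), the loss is evaluated over all pairs rather than by sampled SGD, and a single linear map `W` stands in for the neural network. All function names (`fuzzy_knn_graph`, `loss_and_grad`, `fit_linear_encoder`) are hypothetical.

```python
import numpy as np

def fuzzy_knn_graph(X, k=10):
    """Step 1 (simplified): weighted k-NN graph, symmetrized by fuzzy union."""
    n = X.shape[0]
    d = np.sqrt(((X[:, None, :] - X[None, :, :]) ** 2).sum(-1))
    np.fill_diagonal(d, np.inf)                 # exclude self-neighbors
    idx = np.argsort(d, axis=1)[:, :k]          # indices of the k nearest points
    knn_d = np.take_along_axis(d, idx, axis=1)
    rho = knn_d[:, :1]                          # distance to the nearest neighbor
    # Assumption: a simple bandwidth, not UMAP's binary-searched sigma.
    sigma = (knn_d - rho).mean(axis=1, keepdims=True) + 1e-8
    A = np.zeros((n, n))
    np.put_along_axis(A, idx, np.exp(-(knn_d - rho) / sigma), axis=1)
    return A + A.T - A * A.T                    # fuzzy union keeps weights in [0, 1]

def loss_and_grad(W, X, P, eps=1e-3):
    """Fuzzy cross-entropy between P and q = 1/(1 + d^2), with gradient w.r.t. W."""
    Z = X @ W                                   # parametric embedding
    sq = (Z ** 2).sum(1)
    D2 = np.maximum(sq[:, None] + sq[None, :] - 2 * Z @ Z.T, 0.0)
    off = ~np.eye(len(X), dtype=bool)           # ignore i == j pairs
    # -p*log(q) - (1-p)*log(1-q) simplifies to log(1+D2) - (1-p)*log(D2)
    loss = (np.log1p(D2) - (1 - P) * np.log(D2 + eps))[off].sum()
    G = (1.0 / (1.0 + D2) - (1 - P) / (D2 + eps)) * off   # dLoss/dD2
    dZ = 4 * (np.diag(G.sum(1)) - G) @ Z        # valid because G is symmetric
    return loss, X.T @ dZ

def fit_linear_encoder(X, P, dim=2, steps=100, seed=0):
    """Step 2 (parametric): optimize encoder weights instead of free coordinates."""
    rng = np.random.default_rng(seed)
    W = rng.normal(scale=0.1, size=(X.shape[1], dim))
    loss, grad = loss_and_grad(W, X, P)
    lr, history = 1e-3, [loss]
    for _ in range(steps):
        while lr > 1e-12:                       # backtracking: accept only improvements
            W_try = W - lr * grad
            loss_try, grad_try = loss_and_grad(W_try, X, P)
            if loss_try < loss:
                W, loss, grad = W_try, loss_try, grad_try
                lr *= 1.5
                break
            lr *= 0.5
        history.append(loss)
    return W, history

# Toy data: two Gaussian blobs in 5 dimensions.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (40, 5)), rng.normal(6, 1, (40, 5))])
P = fuzzy_knn_graph(X, k=10)
W, history = fit_linear_encoder(X, P)

# The learned mapping embeds unseen points without re-running the optimization.
X_new = rng.normal(0, 1, (5, 5))
Z_new = X_new @ W
```

The last two lines show the benefit the abstract attributes to a parametric mapping: new data are embedded with a single forward pass through the learned encoder. In a non-parametric setup the embedding coordinates themselves are the optimization variables, so no such mapping exists for unseen points.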

Tim Sainburg, Leland McInnes, Timothy Q Gentner • 2020

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Image Classification | MNIST | Accuracy 94.1 | 395 |
| Clustering | MNIST | NMI 0.7824 | 92 |
| Classification | COIL-20 | Accuracy 0.774 | 76 |
| Dimensionality Reduction | Cassin's | AUC RNX 37.34 | 63 |
| Classification | MNIST | Accuracy 94.2 | 55 |
| Dimensionality Reduction | CIFAR10 | Trustworthiness Score 0.914 | 45 |
| Dimensionality Reduction | Retina | AUC R_NX Score 0.3273 | 42 |
| Dimensionality Reduction | FMNIST | AUC R_NX Score 36.62 | 42 |
| Dimensionality Reduction | MNIST | AUC R_NX Score 31.94 | 42 |
| Classification | Activity | Accuracy 90 | 34 |
Showing 10 of 56 rows
