Our new X account is live! Follow @wizwand_team for updates
WorkDL logo mark

Efficient Algorithms for t-distributed Stochastic Neighborhood Embedding

About

t-distributed Stochastic Neighborhood Embedding (t-SNE) is a method for dimensionality reduction and visualization that has become widely popular in recent years. Efficient implementations of t-SNE are available, but they scale poorly to datasets with hundreds of thousands to millions of high dimensional data-points. We present Fast Fourier Transform-accelerated Interpolation-based t-SNE (FIt-SNE), which dramatically accelerates the computation of t-SNE. The most time-consuming step of t-SNE is a convolution that we accelerate by interpolating onto an equispaced grid and subsequently using the fast Fourier transform to perform the convolution. We also optimize the computation of input similarities in high dimensions using multi-threaded approximate nearest neighbors. We further present a modification to t-SNE called "late exaggeration," which allows for easier identification of clusters in t-SNE embeddings. Finally, for datasets that cannot be loaded into the memory, we present out-of-core randomized principal component analysis (oocPCA), so that the top principal components of a dataset can be computed without ever fully loading the matrix, hence allowing for t-SNE of large datasets to be computed on resource-limited machines.

George C. Linderman, Manas Rachh, Jeremy G. Hoskins, Stefan Steinerberger, Yuval Kluger• 2017

Related benchmarks

TaskDatasetResultRank
Dimension ReductionscRNA 21086x1000
Runtime (s)131
6
Dimension ReductionCOIL100 7200x49152
Runtime (s)2.68e+3
6
Dimension ReductionCOIL20 1440x16384
Runtime (s)75
6
Dimension ReductionPen Digits 1797x64
Runtime (s)48
6
Dimension ReductionShuttle 58000x9
Runtime (s)108
5
Dimension ReductionMNIST 70000x784
Runtime (s)292
5
Dimension ReductionF-MNIST 70000x784
Runtime (s)278
5
Dimension ReductionFlow 100000x17
Runtime (s)164
5
Dimension ReductionGoogle News 200000x300
Runtime (s)652
4
Showing 9 of 9 rows

Other info

Follow for update