Fast Exact Leverage Score Sampling from Khatri-Rao Products with Applications to Tensor Decomposition
About
We present a data structure to randomly sample rows from the Khatri-Rao product of several matrices according to the exact distribution of its leverage scores. Our proposed sampler draws each row in time logarithmic in the height of the Khatri-Rao product and quadratic in its column count, with persistent space overhead at most the size of the input matrices. As a result, it tractably draws samples even when the matrices forming the Khatri-Rao product have tens of millions of rows each. When used to sketch the linear least squares problems arising in CANDECOMP / PARAFAC tensor decomposition, our method achieves lower asymptotic complexity per solve than recent state-of-the-art methods. Experiments on billion-scale sparse tensors validate our claims, with our algorithm achieving higher accuracy than competing methods as the decomposition rank grows.
Related benchmarks
| Task | Dataset | Result | Rank | |
|---|---|---|---|---|
| Sparse Tensor Decomposition | UBER | Fit0.243 | 20 | |
| Sparse Tensor Decomposition | ENRON | Fit0.182 | 10 | |
| CP Decomposition | UBER | Avg Time per ALS Round (s)0.2 | 4 | |
| CP Decomposition | NELL-2 | Average time per ALS round (seconds)0.6 | 4 | |
| Sparse Tensor Decomposition | NELL-2 | Fit0.0814 | 4 | |
| CP Decomposition | ENRON | Avg Time per ALS Round (s)0.5 | 3 | |
| CP Decomposition | AMAZON | Average time per ALS round (seconds)3.4 | 2 | |
| Sparse Tensor Decomposition | AMAZON | Reconstruction Fit0.4 | 2 | |
| Sparse Tensor Decomposition | Fit0.109 | 2 | ||
| CP Decomposition | Average time per ALS round (seconds)26 | 1 |