Optimal Sketching for Kronecker Product Regression and Low Rank Approximation
About
We study the Kronecker product regression problem, in which the design matrix is a Kronecker product of two or more matrices. Given $A_i \in \mathbb{R}^{n_i \times d_i}$ for $i=1,2,\dots,q$ where $n_i \gg d_i$ for each $i$, and $b \in \mathbb{R}^{n_1 n_2 \cdots n_q}$, let $\mathcal{A} = A_1 \otimes A_2 \otimes \cdots \otimes A_q$. Then for $p \in [1,2]$, the goal is to find $x \in \mathbb{R}^{d_1 \cdots d_q}$ that approximately minimizes $\|\mathcal{A}x - b\|_p$. Recently, Diao, Song, Sun, and Woodruff (AISTATS, 2018) gave an algorithm which is faster than forming the Kronecker product $\mathcal{A}$ Specifically, for $p=2$ their running time is $O(\sum_{i=1}^q \text{nnz}(A_i) + \text{nnz}(b))$, where nnz$(A_i)$ is the number of non-zero entries in $A_i$. Note that nnz$(b)$ can be as large as $n_1 \cdots n_q$. For $p=1,$ $q=2$ and $n_1 = n_2$, they achieve a worse bound of $O(n_1^{3/2} \text{poly}(d_1d_2) + \text{nnz}(b))$. In this work, we provide significantly faster algorithms. For $p=2$, our running time is $O(\sum_{i=1}^q \text{nnz}(A_i) )$, which has no dependence on nnz$(b)$. For $p<2$, our running time is $O(\sum_{i=1}^q \text{nnz}(A_i) + \text{nnz}(b))$, which matches the prior best running time for $p=2$. We also consider the related all-pairs regression problem, where given $A \in \mathbb{R}^{n \times d}, b \in \mathbb{R}^n$, we want to solve $\min_{x} \|\bar{A}x - \bar{b}\|_p$, where $\bar{A} \in \mathbb{R}^{n^2 \times d}, \bar{b} \in \mathbb{R}^{n^2}$ consist of all pairwise differences of the rows of $A,b$. We give an $O(\text{nnz}(A))$ time algorithm for $p \in[1,2]$, improving the $\Omega(n^2)$ time needed to form $\bar{A}$. Finally, we initiate the study of Kronecker product low rank and low $t$-rank approximation. For input $\mathcal{A}$ as above, we give $O(\sum_{i=1}^q \text{nnz}(A_i))$ time algorithms, which is much faster than computing $\mathcal{A}$.
Related benchmarks
| Task | Dataset | Result | Rank | |
|---|---|---|---|---|
| Kronecker regression | Random n^2 x d^2 design matrix | Loss4.00e-4 | 118 | |
| Kronecker regression | Synthetic n^2 x d^2 design matrix (test) | Running Time (s)0.0051 | 118 | |
| Low-rank tensor decomposition | Cardiac MRI tensor | Avg Iteration Time (s)1.3 | 64 | |
| Tucker decomposition | Cardiac MRI tensor | Relative Reconstruction Error0.409 | 63 | |
| Tensor Decomposition | Hyperspectral image tensor | Avg Iteration Time (s)2.203 | 48 | |
| Tensor Decomposition | COIL-100 | Average Iteration Time (s)10.131 | 48 | |
| Tensor Reconstruction | COIL-100 1.0 (test) | Relative Reconstruction Error0.356 | 48 | |
| Hyperspectral Tensor Reconstruction | Hyperspectral radiance image tensor (full) | Relative Reconstruction Error0.139 | 47 | |
| Kronecker regression | Synthetic Kronecker product matrices d=64 | Loss0.035 | 15 |