Who Said Neural Networks Aren't Linear?
About
Neural networks are famously nonlinear. However, linearity is defined relative to a pair of vector spaces, $f:X \to Y$. Leveraging the algebraic concept of transport of structure, we propose a method to explicitly identify non-standard vector spaces where a neural network acts as a linear operator. When sandwiching a linear operator $A$ between two invertible neural networks, $f(x)=g_y^{-1}(A g_x(x))$, the corresponding vector spaces $X$ and $Y$ are induced by newly defined addition and scaling actions derived from $g_x$ and $g_y$. We term this kind of architecture a Linearizer. This framework makes the entire arsenal of linear algebra, including SVD, pseudo-inverse, orthogonal projection and more, applicable to nonlinear mappings. Furthermore, we show that the composition of two Linearizers that share a neural network is also a Linearizer. We leverage this property and demonstrate that training diffusion models with our architecture collapses the hundreds of sampling steps into a single step. We further use our framework to enforce idempotency (i.e., $f(f(x))=f(x)$) on networks, leading to a globally projective generative model, and to demonstrate modular style transfer.
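The construction in the abstract can be checked numerically on a toy case. The sketch below is not the authors' implementation: it replaces the invertible networks $g_x$, $g_y$ with simple elementwise bijections (`sinh`, `cbrt`), and the names `add_X`, `scale_X`, `add_Y`, `scale_Y`, `g_z`, and `B` are hypothetical, introduced only for illustration. It assumes the induced operations take the transport-of-structure form $x_1 \oplus x_2 = g_x^{-1}(g_x(x_1) + g_x(x_2))$ and $a \odot x = g_x^{-1}(a\, g_x(x))$, and likewise for $Y$ with $g_y$.

```python
import numpy as np

rng = np.random.default_rng(0)

# ASSUMPTION: elementwise bijections stand in for the invertible networks.
g_x, g_x_inv = np.sinh, np.arcsinh
g_y, g_y_inv = np.cbrt, (lambda z: z ** 3)
g_z, g_z_inv = (lambda z: z ** 3), np.cbrt  # hypothetical third map

A = rng.standard_normal((3, 3))  # linear core of the first Linearizer
B = rng.standard_normal((3, 3))  # linear core of a second one (hypothetical)


def f(x):
    """Linearizer f(x) = g_y^{-1}(A g_x(x))."""
    return g_y_inv(A @ g_x(x))


def f2(y):
    """Second Linearizer sharing g_y with f: g_z^{-1}(B g_y(y))."""
    return g_z_inv(B @ g_y(y))


# Induced vector-space operations on X (via g_x) and on Y (via g_y).
def add_X(x1, x2):
    return g_x_inv(g_x(x1) + g_x(x2))


def scale_X(a, x):
    return g_x_inv(a * g_x(x))


def add_Y(y1, y2):
    return g_y_inv(g_y(y1) + g_y(y2))


def scale_Y(a, y):
    return g_y_inv(a * g_y(y))


x1, x2 = rng.standard_normal(3), rng.standard_normal(3)
a = 2.5

# f is additive and homogeneous with respect to the induced operations.
additive = np.allclose(f(add_X(x1, x2)), add_Y(f(x1), f(x2)))
homogeneous = np.allclose(f(scale_X(a, x1)), scale_Y(a, f(x1)))

# Composition through the shared map g_y is again a Linearizer with core B @ A.
composed_ok = np.allclose(f2(f(x1)), g_z_inv(B @ A @ g_x(x1)))

print(additive, homogeneous, composed_ok)
```

The composition check works because $g_y \circ g_y^{-1}$ cancels between the two maps, leaving $g_z^{-1}(BA\, g_x(x))$. The same cancellation suggests why, when $g_x = g_y$ and $A^2 = A$, the resulting map satisfies $f(f(x)) = f(x)$.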
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Weather forecasting | Weather 96 horizon | MSE 0.167 | 3 |
| Weather forecasting | Weather 192 horizon | MSE 0.224 | 3 |
| Weather forecasting | Weather 336 horizon | MSE 0.257 | 3 |
| Weather forecasting | Weather 720 horizon | MSE 0.368 | 3 |