Lipschitz Recurrent Neural Networks
About
Viewing recurrent neural networks (RNNs) as continuous-time dynamical systems, we propose a recurrent unit that describes the hidden state's evolution with two parts: a well-understood linear component plus a Lipschitz nonlinearity. This particular functional form facilitates stability analysis of the long-term behavior of the recurrent unit using tools from nonlinear systems theory. In turn, this enables architectural design decisions before experimentation. Sufficient conditions for global stability of the recurrent unit are obtained, motivating a novel scheme for constructing hidden-to-hidden matrices. Our experiments demonstrate that the Lipschitz RNN can outperform existing recurrent units on a range of benchmark tasks, including computer vision, language modeling and speech prediction tasks. Finally, through Hessian-based analysis we demonstrate that our Lipschitz recurrent unit is more robust with respect to input and parameter perturbations as compared to other continuous-time RNNs.
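To make the "linear component plus Lipschitz nonlinearity" structure concrete, here is a minimal NumPy sketch. The abstract does not spell out the equations, so everything in it is an assumption: the continuous-time form h' = A h + tanh(W h + U x + b), the forward-Euler discretization with step size `eps`, and the blend of symmetric and skew-symmetric parts (controlled by `beta`, shifted by `gamma`) used to build the hidden-to-hidden matrices. The function names and default values are hypothetical and chosen for illustration only.

```python
import numpy as np


def make_hidden_matrix(M, beta=0.75, gamma=0.001):
    """Assumed construction of a hidden-to-hidden matrix from a free parameter M.

    Blends the symmetric part (M + M^T) and the skew-symmetric part (M - M^T),
    then shifts the spectrum left by gamma. With beta = 1 the symmetric part of
    the result is exactly -gamma * I.
    """
    n = M.shape[0]
    return (1.0 - beta) * (M + M.T) + beta * (M - M.T) - gamma * np.eye(n)


def lipschitz_rnn_step(h, x, A, W, U, b, eps=0.03):
    """One forward-Euler step of the assumed ODE h' = A h + tanh(W h + U x + b)."""
    return h + eps * (A @ h + np.tanh(W @ h + U @ x + b))


# Toy dimensions and random parameters (illustrative only, not trained values).
rng = np.random.default_rng(0)
n_hidden, n_input = 8, 3
M_A = rng.normal(scale=1.0 / np.sqrt(n_hidden), size=(n_hidden, n_hidden))
M_W = rng.normal(scale=1.0 / np.sqrt(n_hidden), size=(n_hidden, n_hidden))

A = make_hidden_matrix(M_A)  # linear component
W = make_hidden_matrix(M_W)  # matrix inside the tanh nonlinearity
U = rng.normal(size=(n_hidden, n_input))
b = np.zeros(n_hidden)

# Unroll the recurrent unit over a short random input sequence.
h = np.zeros(n_hidden)
for x in rng.normal(size=(20, n_input)):
    h = lipschitz_rnn_step(h, x, A, W, U, b)
```

Because tanh is 1-Lipschitz, the nonlinear term cannot grow faster than its argument, so in this sketch the long-term behavior is governed largely by the spectra of `A` and `W`. That is the kind of property a stability analysis can reason about before any training is run.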
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Pixel-by-pixel Image Classification | Permuted Sequential MNIST (pMNIST) (test) | Accuracy | 96.3 | 79 |
| Sequential Image Classification | PMNIST (test) | Accuracy (Test) | 97.3 | 77 |
| Sequential Image Classification | S-MNIST (test) | Accuracy | 99.4 | 70 |
| Pixel-level 1-D image classification | Sequential MNIST (test) | Accuracy | 99.4 | 53 |
| Permuted Sequential Image Classification | MNIST Permuted Sequential | Test Accuracy Mean | 96.3 | 50 |
| Sequential Image Classification | Sequential CIFAR10 | Accuracy | 64.2 | 48 |
| 1-D Pixel-level Image Classification | sCIFAR (test) | Accuracy | 64.2 | 46 |
| Ordered Pixel-by-Pixel Classification | MNIST ordered pixels (test) | Accuracy | 99.2 | 42 |
| Character-level Prediction | PTB (test) | BPC (Test) | 1.42 | 42 |
| Pixel-by-pixel Image Classification | CIFAR-10 sequential (test) | Accuracy | 64.2 | 37 |