UnICORNN: A recurrent model for learning very long time dependencies

About

The design of recurrent neural networks (RNNs) to accurately process sequential inputs with long-time dependencies is very challenging on account of the exploding and vanishing gradient problem. To overcome this, we propose a novel RNN architecture which is based on a structure preserving discretization of a Hamiltonian system of second-order ordinary differential equations that models networks of oscillators. The resulting RNN is fast, invertible (in time), memory efficient and we derive rigorous bounds on the hidden state gradients to prove the mitigation of the exploding and vanishing gradient problem. A suite of experiments are presented to demonstrate that the proposed RNN provides state of the art performance on a variety of learning tasks with (very) long-time dependencies.

T. Konstantin Rusch, Siddhartha Mishra• 2021

Related benchmarks

Task	Dataset	Result
Sentiment Analysis	IMDB (test)	Accuracy88.4	306
Sequential Image Classification	S-MNIST (test)	Accuracy98.4	70
Sequential Image Classification	Sequential CIFAR10	Accuracy62.4	60
Pixel-level 1-D image classification	Sequential MNIST (test)	Accuracy98.4	53
Image Classification	noise padded CIFAR-10 (test)	Test Accuracy62.4	21
Permuted Sequential Image Classification	PS-MNIST (test)	Accuracy98.4	18
Vital signs prediction	BDIMC healthcare datasets	RR RMSE1.06	18
Speech Classification	Speech Commands MFCC (test)	Accuracy90.64	16
Heart-rate prediction	PPG data TSR archive (test)	Test L2 Error1.31	13
Respiratory Rate Prediction	Beth Israel Deaconess Medical Center TSR archive (test)	L2 Error1.06	12

Showing 10 of 16 rows

Other info

Code

Follow for update

@wizwand_team Discord