
UnICORNN: A recurrent model for learning very long time dependencies

About

The design of recurrent neural networks (RNNs) that accurately process sequential inputs with long-time dependencies is very challenging on account of the exploding and vanishing gradient problem. To overcome this, we propose a novel RNN architecture based on a structure-preserving discretization of a Hamiltonian system of second-order ordinary differential equations that models networks of oscillators. The resulting RNN is fast, invertible (in time), and memory-efficient, and we derive rigorous bounds on the hidden-state gradients to prove the mitigation of the exploding and vanishing gradient problem. A suite of experiments is presented to demonstrate that the proposed RNN provides state-of-the-art performance on a variety of learning tasks with (very) long-time dependencies.
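To make the idea concrete, here is a minimal NumPy sketch of the kind of recurrence the abstract describes: a symplectic-Euler step for a layer of independent driven oscillators, where a velocity-like state `z` is updated first and the hidden state `y` is updated with the new `z`. The parameter names (`w`, `V`, `b`, `c`, `alpha`, `dt`) and their exact roles are illustrative assumptions, not the paper's definitive parameterization.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def oscillator_layer(u, w, V, b, c, dt=0.1, alpha=0.9):
    """Sketch of one oscillator-network RNN layer (assumed form).

    A symplectic-Euler discretization of second-order oscillator ODEs:
    z is updated using the old y, then y is updated using the new z.
    Because each step is an explicit, invertible map, the trajectory
    can be reconstructed backwards in time (memory efficiency).

    u : input sequence, shape (T, input_dim)
    w, b, c : per-neuron parameter vectors, shape (hidden,)
    V : input weight matrix, shape (hidden, input_dim)
    Returns the hidden states y_1..y_T, shape (T, hidden).
    """
    hidden = w.shape[0]
    y = np.zeros(hidden)   # oscillator positions (hidden state)
    z = np.zeros(hidden)   # oscillator velocities (auxiliary state)
    ys = []
    for t in range(u.shape[0]):
        # step size is modulated per neuron by a trainable gate sigmoid(c)
        gate = sigmoid(c)
        z = z - dt * gate * (np.tanh(w * y + V @ u[t] + b) + alpha * y)
        y = y + dt * gate * z
        ys.append(y.copy())
    return np.stack(ys)
```

The split update (velocity first, then position) is what makes the step structure-preserving: each half-step is a shear map with unit Jacobian determinant, which is one route to the bounded hidden-state gradients the abstract claims.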

T. Konstantin Rusch, Siddhartha Mishra · 2021

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Sentiment Analysis | IMDB (test) | Accuracy | 88.4 | 248 |
| Sequential Image Classification | S-MNIST (test) | Accuracy | 98.4 | 70 |
| Pixel-level 1-D Image Classification | Sequential MNIST (test) | Accuracy | 98.4 | 53 |
| Sequential Image Classification | Sequential CIFAR10 | Accuracy | 62.4 | 48 |
| Image Classification | Noise-padded CIFAR-10 (test) | Test Accuracy | 62.4 | 21 |
| Permuted Sequential Image Classification | PS-MNIST (test) | Accuracy | 98.4 | 18 |
| Vital Signs Prediction | BDIMC healthcare datasets | RR RMSE | 1.06 | 18 |
| Speech Classification | Speech Commands MFCC (test) | Accuracy | 90.64 | 16 |
| Heart-Rate Prediction | PPG data TSR archive (test) | Test L2 Error | 1.31 | 13 |
| Respiratory Rate Prediction | Beth Israel Deaconess Medical Center TSR archive (test) | L2 Error | 1.06 | 12 |

Showing 10 of 16 rows.
