
Long Expressive Memory for Sequence Modeling

About

We propose a novel method called Long Expressive Memory (LEM) for learning long-term sequential dependencies. LEM is gradient-based, can efficiently process sequential tasks with very long-term dependencies, and is sufficiently expressive to learn complicated input-output maps. To derive LEM, we consider a system of multiscale ordinary differential equations, together with a suitable time-discretization of this system. For LEM, we derive rigorous bounds showing mitigation of the exploding and vanishing gradients problem, a well-known challenge for gradient-based recurrent sequential learning methods. We also prove that LEM can approximate a large class of dynamical systems to high accuracy. Our empirical results, ranging from image and time-series classification through dynamical systems prediction to speech recognition and language modeling, demonstrate that LEM outperforms state-of-the-art recurrent neural networks, gated recurrent units, and long short-term memory models.
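The abstract describes LEM as a time-discretization of a system of multiscale ODEs: learned, per-neuron time steps gate how quickly two coupled hidden states evolve. The sketch below illustrates that idea in NumPy; the weight names (`W1`, `V1`, etc.), the sigmoid/tanh choices, and the two-state layout are a plausible reading of the description here, not the authors' reference implementation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lem_step(y, z, u, params, dt=1.0):
    """One step of a LEM-style cell: two learned multiscale time steps
    (dt1, dt2) gate the updates of the coupled hidden states z and y.
    Parameter names are illustrative, not taken from the paper."""
    W1, V1, b1, W2, V2, b2, Wz, Vz, bz, Wy, Vy, by = params
    dt1 = dt * sigmoid(W1 @ y + V1 @ u + b1)  # per-neuron time step for z
    dt2 = dt * sigmoid(W2 @ y + V2 @ u + b2)  # per-neuron time step for y
    z_new = (1 - dt1) * z + dt1 * np.tanh(Wz @ y + Vz @ u + bz)
    y_new = (1 - dt2) * y + dt2 * np.tanh(Wy @ z_new + Vy @ u + by)
    return y_new, z_new

# Run a random input sequence through the cell.
rng = np.random.default_rng(0)
d_h, d_in, T = 8, 3, 50
shapes = [(d_h, d_h), (d_h, d_in), (d_h,)] * 4  # (W, V, b) for each of 4 maps
params = [rng.normal(scale=0.3, size=s) for s in shapes]
y = np.zeros(d_h)
z = np.zeros(d_h)
for t in range(T):
    y, z = lem_step(y, z, rng.normal(size=d_in), params)
```

Because each update is a convex combination of the previous state and a bounded nonlinearity, the hidden states stay bounded over long sequences, which is the intuition behind the gradient bounds mentioned above.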

T. Konstantin Rusch, Siddhartha Mishra, N. Benjamin Erichson, Michael W. Mahoney • 2021

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Character-level Prediction | PTB (test) | BPC (Test) | 1.25 | 42 |
| Sequential Image Classification | MNIST ordered pixel-by-pixel 1.0 (test) | Accuracy | 96.6 | 32 |
| Dynamical systems reconstruction | Lorenz-63 3d | Dstsp | 0.39 | 23 |
| Keyword Spotting | Google Speech Commands Google12 V2 (test) | Accuracy | 95.7 | 22 |
| Word-level prediction | PTB word-level (test) | Perplexity | 72.8 | 19 |
| Sequential Image Recognition | sMNIST | Test Accuracy | 99.5 | 16 |
| Heart-rate prediction | PPG data TSR archive (test) | Test L2 Error | 0.85 | 13 |
| Sequential Image Recognition | nCIFAR-10 | Test Accuracy | 60.5 | 8 |
| Dynamical systems reconstruction | Lorenz-96 20d | Dstsp | 7.2 | 8 |
| Dynamical systems reconstruction | ECG | Dstsp | 16.3 | 7 |

Showing 10 of 20 rows.
