Quasi-Recurrent Neural Networks

About

Recurrent neural networks are a powerful tool for modeling sequential data, but the dependence of each timestep's computation on the previous timestep's output limits parallelism and makes RNNs unwieldy for very long sequences. We introduce quasi-recurrent neural networks (QRNNs), an approach to neural sequence modeling that alternates convolutional layers, which apply in parallel across timesteps, and a minimalist recurrent pooling function that applies in parallel across channels. Despite lacking trainable recurrent layers, stacked QRNNs have better predictive accuracy than stacked LSTMs of the same hidden size. Due to their increased parallelism, they are up to 16 times faster at train and test time. Experiments on language modeling, sentiment classification, and character-level neural machine translation demonstrate these advantages and underline the viability of QRNNs as a basic building block for a variety of sequence tasks.

James Bradbury, Stephen Merity, Caiming Xiong, Richard Socher • 2016
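
The two components named in the abstract can be illustrated with a small sketch. It assumes the simplest gating variant from the QRNN paper (f-pooling, h_t = f_t * h_(t-1) + (1 - f_t) * z_t) and uses plain Python lists rather than a tensor library. The function names, filter width, and sizes are illustrative choices, not the authors' implementation. The convolution for each timestep is independent of the others, so it could run in parallel across timesteps. The pooling loop is sequential in time but purely elementwise, so it could run in parallel across channels.

```python
import math
import random

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def causal_conv1d(x, w):
    """x: T rows of d_in floats; w: k filter taps, each a d_in x d_out matrix.
    Output row t depends only on x[t-k+1 .. t] (zero-padded on the left),
    so no output row depends on another and all rows could be computed at once."""
    k, d_in, d_out = len(w), len(x[0]), len(w[0][0])
    out = []
    for t in range(len(x)):
        row = [0.0] * d_out
        for j in range(k):
            src = t - (k - 1) + j  # timestep read by filter tap j
            if src < 0:
                continue           # left zero-padding contributes nothing
            for i in range(d_in):
                for o in range(d_out):
                    row[o] += x[src][i] * w[j][i][o]
        out.append(row)
    return out

def qrnn_layer(x, w_z, w_f):
    """One QRNN layer with f-pooling (assumed variant; hypothetical helper name)."""
    # Convolutional part: candidate vectors z and forget gates f.
    z = [[math.tanh(v) for v in row] for row in causal_conv1d(x, w_z)]
    f = [[sigmoid(v) for v in row] for row in causal_conv1d(x, w_f)]
    # Recurrent pooling: no trainable weights, elementwise per channel.
    h = [0.0] * len(z[0])
    outputs = []
    for z_t, f_t in zip(z, f):
        h = [ft * hp + (1.0 - ft) * zt for ft, hp, zt in zip(f_t, h, z_t)]
        outputs.append(h)
    return outputs

def random_filters(rng, k, d_in, d_out):
    """Illustrative random weights; a real model would learn these."""
    return [[[rng.gauss(0.0, 1.0) for _ in range(d_out)]
             for _ in range(d_in)] for _ in range(k)]

rng = random.Random(0)
x = [[rng.gauss(0.0, 1.0) for _ in range(4)] for _ in range(6)]  # T=6, d_in=4
w_z = random_filters(rng, 2, 4, 3)  # filter width 2, 3 hidden channels
w_f = random_filters(rng, 2, 4, 3)
h = qrnn_layer(x, w_z, w_f)
print(len(h), len(h[0]))  # 6 timesteps, 3 channels
```

Because the only sequential step is the elementwise update of h, all matrix arithmetic sits in the convolutions. This is the source of the parallelism the abstract describes.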

Related benchmarks

Task	Dataset	Result
Language Modeling	WikiText-103 (test)	Perplexity 33	703
Language Modeling	Penn Treebank (test)	Perplexity 58.43	420
Language Modeling	WikiText2 v1 (test)	Perplexity 66.61	383
Question Answering	SQuAD v1.1 (dev)	F1 Score 79.6	380
Sentiment Analysis	IMDB (test)	Accuracy 91.4	306
Language Modeling	WikiText-103 (val)	PPL 32	261
Character-level Language Modeling	enwik8 (test)	BPC 1.38	195
Language Modeling	Penn Treebank (val)	Perplexity 60.38	178
Text Classification	MR (test)	Accuracy 82.1	155
Subjectivity Classification	Subj (test)	Accuracy 93.4	152

Showing 10 of 20 rows
