
On Multiplicative Integration with Recurrent Neural Networks

About

We introduce a general and simple structural design called Multiplicative Integration (MI) to improve recurrent neural networks (RNNs). MI changes the way in which information from different sources flows and is integrated in the computational building block of an RNN, while introducing almost no extra parameters. The new structure can be easily embedded into many popular RNN models, including LSTMs and GRUs. We empirically analyze its learning behaviour and conduct evaluations on several tasks using different RNN models. Our experimental results demonstrate that Multiplicative Integration can provide a substantial performance boost over many of the existing RNN models.

Yuhuai Wu, Saizheng Zhang, Ying Zhang, Yoshua Bengio, Ruslan Salakhutdinov • 2016
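The abstract only gestures at how the integration step changes, so here is a minimal sketch of the idea: the usual additive building block phi(W x_t + U h_{t-1} + b) is replaced by a gated Hadamard product of the two projections, phi(alpha * W x_t * U h_{t-1} + beta1 * U h_{t-1} + beta2 * W x_t + b), where alpha, beta1, and beta2 are small per-unit vectors (the "almost no extra parameters"). The NumPy function below is an illustrative re-implementation of one recurrent step under that reading, not code released by the authors; all names and shapes are assumptions.

```python
# Minimal sketch of one Multiplicative Integration (MI) RNN step:
#   h_t = tanh(alpha * (W x_t) * (U h_prev) + beta1 * (U h_prev) + beta2 * (W x_t) + b)
# Illustrative only; parameter names and shapes are assumptions, not the authors' code.
import numpy as np

def mi_rnn_step(x_t, h_prev, W, U, alpha, beta1, beta2, b):
    """One recurrent step with Multiplicative Integration.

    x_t    : (input_dim,)            current input
    h_prev : (hidden_dim,)           previous hidden state
    W      : (hidden_dim, input_dim) input-to-hidden weights
    U      : (hidden_dim, hidden_dim) hidden-to-hidden weights
    alpha, beta1, beta2, b : (hidden_dim,) gating / bias vectors
        (the only parameters MI adds over the additive block)
    """
    wx = W @ x_t       # projected input
    uh = U @ h_prev    # projected previous state
    # The additive block would be tanh(wx + uh + b); MI swaps the sum for a
    # gated element-wise product plus the two original linear terms.
    pre = alpha * wx * uh + beta1 * uh + beta2 * wx + b
    return np.tanh(pre)
```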

Related benchmarks

Task                              | Dataset                                  | Result         | Rank
Character-level Language Modeling | enwik8 (test)                            | 1.44 BPC       | 195
Speech Recognition                | WSJ (92-eval)                            | 8.2 WER        | 131
Character-level Language Modeling | text8 (test)                             | 1.44 BPC       | 128
Character-level Language Modeling | Penn Treebank (test)                     | 1.39 BPC       | 113
Character-level Language Modeling | Hutter Prize Wikipedia (test)            | 1.44 bits/char | 28
Byte-size token prediction       | Byte-size token prediction dataset (val) | 1.44 BPC       | 7
