Sequential Neural Models with Stochastic Layers
About
How can we efficiently propagate uncertainty in a latent state representation with recurrent neural networks? This paper introduces stochastic recurrent neural networks, which glue a deterministic recurrent neural network and a state space model together to form a stochastic, sequential neural generative model. The clear separation of deterministic and stochastic layers allows a structured variational inference network to track the factorization of the model's posterior distribution. By retaining the nonlinear recursive structure of a recurrent neural network while averaging over the uncertainty in a latent path, like a state space model, we improve on state-of-the-art results on the Blizzard and TIMIT speech modeling data sets by a large margin, while achieving performance comparable to competing methods on polyphonic music modeling.
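The two-layer structure described above can be sketched as an ancestral sampling pass: a deterministic recurrent layer computes d_t from d_{t-1} and the input u_t, a stochastic state-space layer samples z_t conditioned on z_{t-1} and d_t, and the output x_t is emitted from both states. The NumPy sketch below is illustrative only; the dimensions, the linear/tanh parameterizations, and all names (`generate`, `W_d`, `W_mu`, etc.) are assumptions for exposition, not the paper's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions (not from the paper).
u_dim, d_dim, z_dim, x_dim, T = 3, 8, 4, 5, 10

# Randomly initialised weights stand in for learned parameters.
W_d  = rng.normal(scale=0.1, size=(d_dim, d_dim + u_dim))  # deterministic recurrence
W_mu = rng.normal(scale=0.1, size=(z_dim, z_dim + d_dim))  # prior mean of z_t
W_sd = rng.normal(scale=0.1, size=(z_dim, z_dim + d_dim))  # prior log-std of z_t
W_x  = rng.normal(scale=0.1, size=(x_dim, z_dim + d_dim))  # emission

def generate(u_seq):
    """Draw one sample x_{1:T} from the generative model given inputs u_{1:T}."""
    d = np.zeros(d_dim)
    z = np.zeros(z_dim)
    xs = []
    for u in u_seq:
        # Deterministic RNN layer: d_t depends only on d_{t-1} and u_t.
        d = np.tanh(W_d @ np.concatenate([d, u]))
        # Stochastic SSM layer: z_t ~ N(mu_t, diag(sd_t^2)), conditioned on z_{t-1} and d_t.
        mu = W_mu @ np.concatenate([z, d])
        sd = np.exp(W_sd @ np.concatenate([z, d]))
        z = mu + sd * rng.normal(size=z_dim)
        # Emission: x_t is generated from both the stochastic and deterministic states.
        xs.append(W_x @ np.concatenate([z, d]))
    return np.stack(xs)

x = generate(rng.normal(size=(T, u_dim)))
print(x.shape)  # (10, 5)
```

The clean split matters for inference: because d_{1:T} is a deterministic function of the inputs, the approximate posterior only needs to be structured over z_{1:T}, mirroring the factorization described in the abstract.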
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Polyphonic music modeling | JSB Chorales | Negative Log-Likelihood (nats) | 4.74 | 14 |
| Polyphonic music modeling | Nottingham (Nott) | Negative Log-Likelihood (nats) | 2.94 | 14 |
| Polyphonic music modeling | MuseData (Muse) | Negative Log-Likelihood (nats) | 6.28 | 12 |
| Polyphonic music modeling | Piano-midi.de | Negative Log-Likelihood (nats) | 8.2 | 12 |
| Generative Modeling | Human Motion Capture h3.6m | Log-Likelihood | 2.94 | 10 |
| Generative Modeling | WSJ0 Audio Spectrogram | Log P(x) | 1.94 | 10 |
| Interpolation | Human Motion Capture h3.6m | FID (0.0-0.8) | 43.5 | 10 |
| Interpolation | WSJ0 Audio Spectrogram | Interpolation FID (0.0-0.8) | 19.4 | 10 |