Topographic VAEs learn Equivariant Capsules

About

In this work we seek to bridge the concepts of topographic organization and equivariance in neural networks. To accomplish this, we introduce the Topographic VAE: a novel method for efficiently training deep generative models with topographically organized latent variables. We show that such a model indeed learns to organize its activations according to salient characteristics such as digit class, width, and style on MNIST. Furthermore, through topographic organization over time (i.e. temporal coherence), we demonstrate how predefined latent space transformation operators can be encouraged for observed transformed input sequences -- a primitive form of unsupervised learned equivariance. We demonstrate that this model successfully learns sets of approximately equivariant features (i.e. "capsules") directly from sequences and achieves higher likelihood on correspondingly transforming test sequences. Equivariance is verified quantitatively by measuring the approximate commutativity of the inference network and the sequence transformations. Finally, we demonstrate approximate equivariance to complex transformations, expanding upon the capabilities of existing group equivariant neural networks.

T. Anderson Keller, Max Welling • 2021
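
The abstract describes verifying equivariance by measuring how closely the inference network commutes with the sequence transformations. A minimal sketch of that idea follows, under stated assumptions: the encoders, the cyclic-shift input transformation, and the cyclic-shift latent operator are toy stand-ins chosen for illustration, and `equivariance_error` is a hypothetical helper name, not the metric definition used in the paper.

```python
import numpy as np


def equivariance_error(encode, transform_input, transform_latent, x):
    """Sum of absolute differences between encode(T_x(x)) and T_z(encode(x)).

    A value of zero means the encoder commutes exactly with the transformation
    on this input. Hypothetical helper for illustration only.
    """
    z_of_transformed = encode(transform_input(x))
    transformed_z = transform_latent(encode(x))
    return float(np.abs(z_of_transformed - transformed_z).sum())


rng = np.random.default_rng(0)
x = rng.normal(size=8)  # toy 1-D "image"


# Assumed toy transformations: a cyclic shift in input space and a
# cyclic shift (roll) of the latent vector as the predefined operator.
def shift_input(v):
    return np.roll(v, 1)


def roll_latent(z):
    return np.roll(z, 1)


# An elementwise encoder commutes with cyclic shifts.
def equivariant_encoder(v):
    return 2.0 * v


# A position-dependent encoder does not commute with cyclic shifts.
weights = np.arange(1.0, 9.0)


def non_equivariant_encoder(v):
    return weights * v


err_eq = equivariance_error(equivariant_encoder, shift_input, roll_latent, x)
err_neq = equivariance_error(non_equivariant_encoder, shift_input, roll_latent, x)
print(err_eq, err_neq)
```

In this toy setting the elementwise encoder gives an error of exactly zero, while the position-dependent encoder gives a strictly positive error. A learned inference network would typically fall somewhere in between, which is why the abstract speaks of approximate equivariance.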

Related benchmarks

Task	Dataset	Metric	Value	
Representation Learning	MNIST	Scaling Equivariance Error	505.2	9
Disentanglement	Shapes3D (10% train)	VP Score	88.27	5
Disentanglement	Shapes3D (1% train)	VP Score	68.39	5
Disentanglement	MNIST (10% train)	VP Score	89.91	5
Disentanglement	MNIST (1% train)	VP Score	88.15	5
Equivariance Error Measurement	Isaac3D	Robot X-move Error	8.44e+3	4
Unsupervised Disentanglement	Falcor3D	Lighting Intensity	1.15e+4	4
