Understanding disentangling in $\beta$-VAE
About
We present new intuitions and theoretical assessments of the emergence of disentangled representations in variational autoencoders. Taking a rate-distortion theory perspective, we show the circumstances under which representations aligned with the underlying generative factors of variation in the data emerge as training progresses, when optimising the modified ELBO bound in $\beta$-VAE. From these insights, we propose a modification to the training regime of $\beta$-VAE that progressively increases the information capacity of the latent code during training. This modification facilitates the robust learning of disentangled representations in $\beta$-VAE, without the previous trade-off in reconstruction accuracy.
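The capacity-increase modification can be sketched as a loss function that penalises the deviation of the KL divergence from a target capacity $C$, which is annealed upwards during training. The sketch below is a minimal pure-Python illustration of this idea; the function names and the hyperparameter values (`gamma`, `c_max`, `anneal_steps`) are illustrative assumptions, not values prescribed by the source.

```python
def capacity_schedule(step, c_max=25.0, anneal_steps=100_000):
    """Linearly anneal the target KL capacity C (in nats) from 0 up to c_max.

    c_max and anneal_steps are illustrative hyperparameters.
    """
    return min(c_max, c_max * step / anneal_steps)


def capacity_controlled_loss(recon_loss, kl, step, gamma=1000.0,
                             c_max=25.0, anneal_steps=100_000):
    """Capacity-controlled beta-VAE objective (sketch):

        loss = recon_loss + gamma * |KL(q(z|x) || p(z)) - C|

    where C grows over training, letting the latent code encode
    progressively more information without sacrificing reconstruction.
    """
    c = capacity_schedule(step, c_max, anneal_steps)
    return recon_loss + gamma * abs(kl - c)
```

Early in training `C` is near zero, so the KL term is squeezed and the model encodes only the most important factors; as `C` rises, additional latent dimensions can become informative one at a time.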
Christopher P. Burgess, Irina Higgins, Arka Pal, Loic Matthey, Nick Watters, Guillaume Desjardins, Alexander Lerchner · 2018
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Image Reconstruction | CelebA-HQ (test) | FID (Reconstruction) | 145.3 | 50 |
| Disentanglement | MPI3D (test) | DCI | 30.95 | 17 |
| Disentanglement | SmallNORB (test) | DCI | 33.14 | 17 |
| Image Classification | CelebA-HQ (test) | F1 Score | 68.94 | 13 |
| Disentanglement | CelebA-HQ (test) | Disentanglement | 42.99 | 13 |
| Disentanglement Analysis | CUB (full) | Disentanglement | 37.53 | 8 |
| Disentanglement Analysis | CUB (test) | DCI (average) | 46.94 | 8 |
| Disentanglement Analysis | MPI3D Toy | Disentanglement Score | 18.43 | 8 |
| Disentanglement Analysis | MPI3D realistic | Disentanglement Score | 18.76 | 8 |
| Disentanglement Analysis | MPI3D real | Disentanglement Score | 17.62 | 8 |
Showing 10 of 18 rows.