Group-based Learning of Disentangled Representations with Generalizability for Novel Contents

About

Sensory data are often composed of independent content and transformation factors. For example, face images may have shapes as content and poses as transformation. To infer these factors separately from given data, various "disentangling" models have been proposed. However, many of these are supervised or semi-supervised, either requiring attribute labels that are often unavailable or not allowing generalization to new contents. In this study, we introduce a novel deep generative model, called group-based variational autoencoders. Here, we assume no explicit labels, but only a weaker form of structure that groups together data instances having the same content but transformed differently; we thereby separately estimate a group-common factor as content and an instance-specific factor as transformation. This approach allows for learning to represent a general continuous space of contents, which can accommodate unseen contents. Despite its simplicity, our model succeeded in learning, from five datasets, content representations that are highly separate from the transformation representation and generalizable to data with novel contents. We further provide a detailed analysis of the latent content code and offer insight into how our model obtains its notable transformation invariance and content generalizability.

Haruo Hosoya • 2018
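
The group structure described in the abstract can be illustrated with a small sketch. It is not the paper's implementation: the linear "encoders", the weight matrices, and the choice to average per-instance content estimates over the group are all assumptions made here for illustration, and the decoder and training objective are omitted. The sketch only shows how one content code can be shared by a group while each instance keeps its own transformation code.

```python
# Illustrative sketch only: hypothetical linear encoders stand in for the
# neural networks of a real model. Averaging content estimates over the
# group is an assumption, used as one simple way to get a group-common code.

def linear(weights, x):
    """Apply a weight matrix (list of rows) to a vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]

def encode_group(group, w_content, w_transform):
    """Return (group-common content code, list of per-instance transformation codes)."""
    content_estimates = [linear(w_content, x) for x in group]
    transform_codes = [linear(w_transform, x) for x in group]
    k = len(group)
    dim = len(content_estimates[0])
    # Group-common factor: mean of the per-instance content estimates.
    group_content = [sum(c[d] for c in content_estimates) / k for d in range(dim)]
    return group_content, transform_codes

# Hypothetical toy data: three instances assumed to share the same content.
w_content = [[1.0, 0.0, 0.0, 0.0],
             [0.0, 1.0, 0.0, 0.0]]
w_transform = [[0.0, 0.0, 1.0, 1.0]]
group = [[1.0, 2.0, 0.5, 0.0],
         [3.0, 2.0, 0.0, 1.5],
         [2.0, 2.0, 1.0, 1.0]]

group_content, transform_codes = encode_group(group, w_content, w_transform)
print(group_content)     # [2.0, 2.0]
print(transform_codes)   # [[0.5], [1.5], [2.0]]
```

Because the content code is a mean over the group, it does not depend on the order of the instances, while each transformation code depends only on its own instance.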

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| FoV regression | Cars3D (all) | R2 Score 0.987 | 55 |
| Disentangled Representation Learning | Cars3D | FactorVAE 0.877 | 35 |
| Disentanglement | Shapes3D | -- | 18 |
| Abstract Visual Reasoning | Abstract Visual Reasoning WReN (10^2 samples) | Accuracy 18 | 15 |
| Disentanglement | Shapes3D | BetaVAE Score 1 | 13 |
| Disentanglement | MPI3D | BetaVAE Score 0.704 | 13 |
| Abstract Visual Reasoning | Abstract Visual Reasoning 10^4 samples WReN | Classification Accuracy 68.4 | 5 |
| Abstract Visual Reasoning | Abstract Visual Reasoning dataset WReN | Accuracy 23.7 | 5 |
| Abstract Visual Reasoning | Abstract Visual Reasoning WReN (10^5 samples) | Accuracy 93.6 | 5 |
| Abstract Visual Reasoning | Abstract Visual Reasoning WReN | Accuracy 28.7 | 4 |
Showing 10 of 11 rows
