Fully Distributed, Flexible Compositional Visual Representations via Soft Tensor Products

About

Since the inception of the classicalist vs. connectionist debate, it has been argued that the ability to systematically combine symbol-like entities into compositional representations is crucial for human intelligence. In connectionist systems, the field of disentanglement has gained prominence for its ability to produce explicitly compositional representations; however, it relies on a fundamentally symbolic, concatenative representation of compositional structure that clashes with the continuous, distributed foundations of deep learning. To resolve this tension, we extend Smolensky's Tensor Product Representation (TPR) and introduce Soft TPR, a representational form that encodes compositional structure in an inherently distributed, flexible manner, along with Soft TPR Autoencoder, a theoretically-principled architecture designed specifically to learn Soft TPRs. Comprehensive evaluations in the visual representation learning domain demonstrate that the Soft TPR framework consistently outperforms conventional disentanglement alternatives -- achieving state-of-the-art disentanglement, boosting representation learner convergence, and delivering superior sample efficiency and low-sample regime performance in downstream tasks. These findings highlight the promise of a distributed and flexible approach to representing compositional structure by potentially enhancing alignment with the core principles of deep learning over the conventional symbolic approach.

Bethia Sun, Maurice Pagnucco, Yang Song • 2024
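As a rough illustration of the representational form the abstract builds on, the sketch below constructs a classical Smolensky-style tensor product representation: each filler vector is bound to a role vector by an outer product, and the bindings are summed, so every binding is spread over the whole representation instead of occupying its own concatenated slot. The function names, the two roles, and the filler values are hypothetical and chosen only for this example. The final perturbation step is merely an informal stand-in for a representation that is approximately, not exactly, a TPR; it is an assumption made for illustration and not the paper's definition of Soft TPR or its autoencoder.

```python
# Minimal sketch of a classical tensor product representation (TPR).
# All names and values below are hypothetical, chosen for illustration only.

def outer(u, v):
    """Outer product of two vectors as a nested list (len(u) x len(v))."""
    return [[a * b for b in v] for a in u]

def add(m1, m2):
    """Element-wise sum of two equally sized matrices."""
    return [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(m1, m2)]

def tpr(bindings, roles, fillers):
    """Sum the filler-role outer products over all (role, filler) bindings."""
    dim_f = len(next(iter(fillers.values())))
    dim_r = len(next(iter(roles.values())))
    total = [[0.0] * dim_r for _ in range(dim_f)]
    for role_name, filler_name in bindings.items():
        total = add(total, outer(fillers[filler_name], roles[role_name]))
    return total

def unbind(rep, role):
    """Project the representation onto a role vector to recover its filler.

    Recovery is exact only when the role vectors are orthonormal.
    """
    return [sum(x * r for x, r in zip(row, role)) for row in rep]

# Orthonormal role vectors, one per factor of variation (an assumption).
ROLES = {"shape": [1.0, 0.0], "colour": [0.0, 1.0]}
# Arbitrary filler embeddings, one per factor value (an assumption).
FILLERS = {"cube": [1.0, 2.0, 0.0], "red": [0.0, -1.0, 3.0]}

rep = tpr({"shape": "cube", "colour": "red"}, ROLES, FILLERS)
recovered_shape = unbind(rep, ROLES["shape"])
recovered_colour = unbind(rep, ROLES["colour"])

# Illustrative assumption: perturb the exact TPR slightly and unbind again.
# The fillers are then recovered only approximately.
perturbed = [[x + 0.01 for x in row] for row in rep]
approx_shape = unbind(perturbed, ROLES["shape"])
```

With orthonormal roles, unbinding the exact representation returns each filler unchanged, while unbinding the perturbed one returns values close to the original filler. Concatenating the two fillers would instead place each factor in a separate block of dimensions, which is the slot-based layout the abstract contrasts with a distributed one.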

Related benchmarks

Task	Dataset	Result
Disentangled Representation Learning	Cars3D	FactorVAE 0.999	57
FoV regression	Cars3D (all)	R2 Score 1	55
Disentanglement	Shapes3D	FactorVAE Score 0.984	34
sys-bAbI task	sys-bAbI original (test)	Gap 1.97	22
Abstract Visual Reasoning	Abstract Visual Reasoning WReN (10^2 samples)	Accuracy 27.3	15
Disentanglement	MPI3D	BetaVAE Score 1	13
Disentanglement	Shapes3D	BetaVAE Score 1	13
Abstract Visual Reasoning	Abstract Visual Reasoning dataset WReN	Accuracy 31.2	5
Abstract Visual Reasoning	Abstract Visual Reasoning WReN (10^4 samples)	Classification Accuracy 56	5
Abstract Visual Reasoning	Abstract Visual Reasoning WReN (10^5 samples)	Accuracy 86.9	5

Showing 10 of 12 rows
