SOM-VQ: Topology-Aware Tokenization for Interactive Generative Models

About

Vector-quantized representations enable powerful discrete generative models but lack semantic structure in token space, limiting interpretable human control. We introduce SOM-VQ, a tokenization method that combines vector quantization with Self-Organizing Maps to learn discrete codebooks with explicit low-dimensional topology. Unlike standard VQ-VAE, SOM-VQ uses topology-aware updates that preserve neighborhood structure: nearby tokens on a learned grid correspond to semantically similar states, enabling direct geometric manipulation of the latent space. We demonstrate that SOM-VQ produces more learnable token sequences in the evaluated domains while providing an explicit navigable geometry in code space. Critically, the topological organization enables intuitive human-in-the-loop control: users can steer generation by manipulating distances in token space, achieving semantic alignment without frame-level constraints. We focus on human motion generation - a domain where kinematic structure, smooth temporal continuity, and interactive use cases (choreography, rehabilitation, HCI) make topology-aware control especially natural - demonstrating controlled divergence and convergence from reference sequences through simple grid-based sampling. SOM-VQ provides a general framework for interpretable discrete representations applicable to music, gesture, and other interactive generative domains.
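The topology-aware update described above can be illustrated with a small sketch. This is not the authors' implementation: it assumes a classical Kohonen-style rule with a Gaussian neighborhood on a 2D grid, since the abstract does not spell out the exact update. All names here (`grid_coords`, `som_vq_step`, `sample_near`, `lr`, `sigma`, `radius`) are hypothetical and chosen only for illustration.

```python
import numpy as np

def grid_coords(rows, cols):
    """Return an array of shape (rows*cols, 2) holding the (row, col) of each code."""
    r, c = np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij")
    return np.stack([r.ravel(), c.ravel()], axis=1).astype(float)

def som_vq_step(codebook, coords, z, lr=0.1, sigma=1.0):
    """One SOM-style update for a single latent vector z (assumed rule).

    Returns the index of the best-matching code (the token) and a new,
    updated codebook. lr and sigma are illustrative hyperparameters.
    """
    # Nearest code in latent space: the usual VQ assignment.
    bmu = int(np.argmin(np.sum((codebook - z) ** 2, axis=1)))
    # Gaussian neighborhood measured on the grid, not in latent space.
    grid_d2 = np.sum((coords - coords[bmu]) ** 2, axis=1)
    h = np.exp(-grid_d2 / (2.0 * sigma ** 2))
    # Pull the winner and, more weakly, its grid neighbors toward z.
    new_codebook = codebook + lr * h[:, None] * (z - codebook)
    return bmu, new_codebook

def sample_near(token, coords, radius, rng):
    """Sample a token within Chebyshev grid distance `radius` of `token`."""
    d = np.max(np.abs(coords - coords[token]), axis=1)
    candidates = np.flatnonzero(d <= radius)
    return int(rng.choice(candidates))

# Usage: a 4x4 grid of codes over a 3-dimensional latent space.
rng = np.random.default_rng(0)
coords = grid_coords(4, 4)
codebook = rng.normal(size=(16, 3))
z = rng.normal(size=3)
token, codebook = som_vq_step(codebook, coords, z)
neighbor = sample_near(token, coords, radius=1, rng=rng)
```

Because neighbors on the grid are updated together, codes that sit close on the grid tend to end up close in latent space. Under that assumption, a small `radius` in `sample_near` would keep sampled tokens near a reference token, while a larger one would allow more divergence from it.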

Alessandro Londei, Denise Lanzieri, Matteo Benati • 2026

Related benchmarks

Task | Dataset | Result | Rank
Vector Quantization | Lorenz | Act (%): 71 | 4
Vector Quantization | AIST++ | Activation (%): 61 | 4
Vector Quantization / Latent Space Modeling | Lorenz attractor standard (val) | Accuracy: 0.71 | 4
Vector Quantization / Latent Space Modeling | AIST++ standard (val) | Activation Percentage: 61 | 4
