Arch-VQ: Discrete Architecture Representation Learning with Autoregressive Priors
About
Existing neural architecture representation learning methods focus on continuous representations, typically using Variational Autoencoders (VAEs) to map discrete architectures onto a continuous Gaussian latent space. However, sampling from these spaces often yields a high proportion of invalid or duplicate architectures, likely because the inherently discrete architecture space is forced onto a continuous one. In this work, we revisit architecture representation learning from a fundamentally discrete perspective. We propose Arch-VQ, a framework that learns a discrete latent space of neural architectures with a Vector-Quantized Variational Autoencoder (VQ-VAE) and models the latent prior with an autoregressive transformer. This formulation yields discrete architecture representations that are better aligned with the underlying search space while decoupling representation learning from prior modeling. Across the NAS-Bench-101, NAS-Bench-201, and DARTS search spaces, Arch-VQ improves the quality of generated architectures, increasing the rate of valid and unique generations by 22%, 26%, and 135%, respectively, over state-of-the-art baselines. We further show that modeling the discrete embeddings autoregressively improves downstream neural predictor performance, establishing the practical utility of this discrete formulation.
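To illustrate the core operation a VQ-VAE relies on, here is a minimal sketch of the nearest-codebook quantization step: each continuous encoder output is snapped to its closest learned code vector, and the resulting index is the discrete token an autoregressive prior can then model. This is a generic illustration with made-up toy values, not the Arch-VQ implementation; the function name `vector_quantize` and the dimensions are assumptions for the example.

```python
def vector_quantize(z, codebook):
    """Map each continuous encoder output to its nearest codebook entry.

    z:        list of d-dimensional encoder outputs (one per architecture).
    codebook: list of K learned d-dimensional code vectors.
    Returns (indices, quantized): indices[i] is the discrete token for z[i];
    quantized[i] is the code vector fed onward to the decoder.
    """
    indices = []
    for v in z:
        # Squared Euclidean distance from v to every code vector.
        d2 = [sum((a - b) ** 2 for a, b in zip(v, c)) for c in codebook]
        indices.append(min(range(len(codebook)), key=d2.__getitem__))
    quantized = [codebook[i] for i in indices]
    return indices, quantized

# Toy example: 3 encoder outputs, a codebook of 4 entries in 2 dimensions.
codebook = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
z = [[0.1, -0.1], [0.9, 0.2], [0.4, 0.9]]
idx, zq = vector_quantize(z, codebook)
print(idx)  # → [0, 1, 2]
```

The sequence of indices produced this way is what decouples representation learning from prior modeling: the VQ-VAE fixes the token vocabulary, and a transformer is trained separately to predict each token from the ones before it.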
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Neural Architecture Search | NAS-Bench-201 CIFAR-100 | Accuracy | 69.47 | 24 |
| Architecture Generation | NAS-Bench-201 | Validity | 97.52 | 8 |
| Architecture Representation Quality | NAS-Bench-101 1.0 (test) | Validity | 89.73 | 7 |
| Neural Architecture Search | NAS-Bench-201 CIFAR-10 | Max Accuracy | 94.37 | 5 |
| Architecture Generation | DARTS | Validity | 99.46 | 3 |