
General-purpose, long-context autoregressive modeling with Perceiver AR

About

Real-world data is high-dimensional: a book, image, or musical performance can easily contain hundreds of thousands of elements even after compression. However, the most commonly used autoregressive models, Transformers, are prohibitively expensive to scale to the number of inputs and layers needed to capture this long-range structure. We develop Perceiver AR, an autoregressive, modality-agnostic architecture which uses cross-attention to map long-range inputs to a small number of latents while also maintaining end-to-end causal masking. Perceiver AR can directly attend to over a hundred thousand tokens, enabling practical long-context density estimation without the need for hand-crafted sparsity patterns or memory mechanisms. When trained on images or music, Perceiver AR generates outputs with clear long-term coherence and structure. Our architecture also obtains state-of-the-art likelihood on long-sequence benchmarks, including 64 x 64 ImageNet images and PG-19 books.
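The core idea in the abstract, causally masked cross-attention from a long input to a small set of latents, can be sketched roughly as below. This is an illustrative toy, not the authors' implementation. It assumes each latent is queried from one of the final `num_latents` input positions and may attend only to positions at or before its own. The function name `causal_cross_attention` is hypothetical, and the learned query/key/value projections of a real model are omitted, so the raw input vectors serve as queries, keys, and values.

```python
import math


def causal_cross_attention(inputs, num_latents):
    """Map a long sequence to `num_latents` latents with a causal mask.

    inputs: list of seq_len vectors (lists of floats), all of equal length.
    Assumption: latent i is queried from input position
    seq_len - num_latents + i and sees only positions up to its own.
    Learned projections are omitted for brevity (hypothetical simplification).
    """
    seq_len = len(inputs)
    if not 1 <= num_latents <= seq_len:
        raise ValueError("num_latents must be between 1 and len(inputs)")
    dim = len(inputs[0])
    offset = seq_len - num_latents

    latents = []
    for i in range(num_latents):
        pos = offset + i
        query = inputs[pos]
        # Causal mask: score only the positions 0..pos, never later ones.
        scores = [
            sum(q * k for q, k in zip(query, inputs[j])) / math.sqrt(dim)
            for j in range(pos + 1)
        ]
        # Numerically stable softmax over the visible positions.
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        # Weighted sum of the visible input vectors gives the latent.
        latents.append(
            [sum(w * inputs[j][d] for j, w in enumerate(weights)) for d in range(dim)]
        )
    return latents


# Usage: 10 input positions compressed to 3 latents.
inputs = [[math.sin(i + d) for d in range(4)] for i in range(10)]
latents = causal_cross_attention(inputs, num_latents=3)
```

In this sketch the cost of the cross-attention step grows with `num_latents * seq_len` rather than `seq_len ** 2`, which is the property that makes very long contexts affordable; any further processing would then operate on the short latent sequence only.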

Curtis Hawthorne, Andrew Jaegle, Cătălina Cangea, Sebastian Borgeaud, Charlie Nash, Mateusz Malinowski, Sander Dieleman, Oriol Vinyals, Matthew Botvinick, Ian Simon, Hannah Sheahan, Neil Zeghidour, Jean-Baptiste Alayrac, João Carreira, Jesse Engel • 2022

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Language Modeling | WikiText-103 (test) | Perplexity | 18.25 | 524 |
| Language Modeling | WikiText-103 (val) | PPL | 17.58 | 180 |
| Language Modeling | PG-19 (test) | Perplexity | 28.9 | 106 |
| Density Estimation | ImageNet 64x64 (test) | Bits Per Sub-Pixel | 3.4 | 62 |
| Language Modeling | PG-19 (val) | Perplexity | 45.9 | 19 |
| Density Estimation | ImageNet 64x64 (val) | Bits/dim | 3.4 | 13 |
| Long-Context Video Prediction | DMLab 64x64 | FVD | 96 | 12 |
| Long-Context Video Prediction | Minecraft 128x128 (test) | SSIM | 0.323 | 6 |
| Symbolic music generation | MAESTRO v1 (val) | Negative Log-Likelihood | 1.82 | 2 |
| Symbolic music generation | MAESTRO v1 (test) | Negative Log-Likelihood | 1.82 | 1 |
Showing 10 of 12 rows
