
Revisiting Autoregressive Models for Generative Image Classification

About

Class-conditional generative models have emerged as accurate and robust classifiers, with diffusion models demonstrating clear advantages over other visual generative paradigms, including autoregressive (AR) models. In this work, we revisit visual AR-based generative classifiers and identify an important limitation of prior approaches: their reliance on a fixed token order, which imposes a restrictive inductive bias for image understanding. We observe that single-order predictions rely more on partial discriminative cues, while averaging over multiple token orders provides a more comprehensive signal. Based on this insight, we leverage recent any-order AR models to estimate order-marginalized predictions, unlocking the high classification potential of AR models. Our approach consistently outperforms diffusion-based classifiers across diverse image classification benchmarks, while being up to 25x more efficient. Compared to state-of-the-art self-supervised discriminative models, our method delivers competitive classification performance, a notable achievement for generative classifiers.
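The idea of classifying with order-averaged likelihoods can be illustrated with a small toy sketch. It is not the authors' implementation: the "model" below is a hypothetical hand-built conditional (`cond_logprob`), the constants `VOCAB`, `NUM_TOKENS`, and `NUM_CLASSES` are arbitrary, and averaging per-order log-likelihoods under a uniform class prior is only one plausible reading of "order-marginalized predictions"; the paper's exact estimator may differ.

```python
import math
import random

# Toy sizes, chosen only for illustration (assumptions, not from the paper).
VOCAB = 4
NUM_TOKENS = 6
NUM_CLASSES = 3


def make_toy_model(seed=0):
    """Random class- and position-specific logits: bias[label][pos][value]."""
    rng = random.Random(seed)
    return [[[rng.gauss(0, 1) for _ in range(VOCAB)]
             for _ in range(NUM_TOKENS)]
            for _ in range(NUM_CLASSES)]


def cond_logprob(bias, label, pos, value, context):
    """Hypothetical any-order conditional: log p(x_pos = value | context, label).

    `context` maps already-revealed positions to their token values.
    """
    logits = []
    for v in range(VOCAB):
        agree = sum(1 for tok in context.values() if tok == v)
        logits.append(bias[label][pos][v] + 0.3 * agree)
    m = max(logits)
    log_z = m + math.log(sum(math.exp(l - m) for l in logits))
    return logits[value] - log_z


def order_loglik(bias, label, tokens, order):
    """Chain-rule log-likelihood of `tokens` when revealed in `order`."""
    context, total = {}, 0.0
    for pos in order:
        total += cond_logprob(bias, label, pos, tokens[pos], context)
        context[pos] = tokens[pos]
    return total


def classify(bias, tokens, num_orders=8, seed=0):
    """Score each class by its log-likelihood averaged over random orders."""
    rng = random.Random(seed)
    n = len(tokens)
    # The same sampled orders are reused for every class.
    orders = [rng.sample(range(n), n) for _ in range(num_orders)]
    scores = [
        sum(order_loglik(bias, y, tokens, o) for o in orders) / num_orders
        for y in range(NUM_CLASSES)
    ]
    # Uniform class prior: the posterior is a softmax over the class scores.
    m = max(scores)
    log_z = m + math.log(sum(math.exp(s - m) for s in scores))
    posterior = [math.exp(s - log_z) for s in scores]
    pred = max(range(NUM_CLASSES), key=lambda y: scores[y])
    return pred, posterior


if __name__ == "__main__":
    model = make_toy_model(seed=0)
    pred, posterior = classify(model, [0, 1, 2, 3, 0, 1], num_orders=8)
    print("predicted class:", pred)
    print("posterior:", [round(p, 3) for p in posterior])
```

Setting `num_orders=1` with a fixed raster order corresponds to the single-order baseline the abstract criticizes; increasing `num_orders` trades compute for a score that depends less on any one order.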

Ilia Sudakov, Artem Babenko, Dmitry Baranchuk • 2026

Related benchmarks

Task | Dataset | Result | Rank
Image Classification | ImageNet A | Top-1 Acc: 23.3 | 654
Image Classification | ImageNet-Sketch | Top-1 Accuracy: 45.9 | 407
Image Classification | ImageNet-R | Accuracy: 53 | 217
Image Classification | ImageNet-S | Top-1 Acc: 45.9 | 92
Image Classification | ImageNet A | -- | 50
Image Classification | ImageNet-C Gaussian Noise | Top-1 Accuracy: 65.2 | 24
Image Classification | ImageNet-C JPEG Corruptions | Top-1 Accuracy: 75.1 | 24
Image Classification | CelebA WILDS (test) | I.I.D. Accuracy: 92 | 19
Image Classification | Camelyon WILDS 17 | ID Accuracy: 99.6 | 10
