
Spectral Vision Transformer for Efficient Tokenization with Limited Data

About

We propose a novel spectral vision transformer architecture for efficient tokenization with limited data, with an emphasis on medical imaging. We outline convenient theoretical properties arising from the choice of basis, including spatial invariance and optimal signal-to-noise ratio. We show that the spectral projection reduces complexity compared to spatial vision transformers. We show comparable or superior performance with fewer parameters than a variety of models, including compact and standard vision transformers, convolutional neural networks with attention, shifted window transformers, multi-layer perceptrons, and logistic regression. We include simulated, public, and clinical data in our analysis and release our code at: github.com/agr78/spectralViT.
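The abstract does not state which basis or tokenization scheme is used, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes a Fourier basis on a 1D signal, keeps the magnitudes of the first few coefficients as tokens, and uses hypothetical helper names (`dft`, `spectral_tokens`, `circular_shift`). Under those assumptions it illustrates two ideas the abstract mentions: tokens that do not change when the input is shifted, and a shorter token sequence, which lowers the number of pairwise attention interactions.

```python
import cmath


def dft(signal):
    """Naive O(n^2) discrete Fourier transform of a 1D sequence."""
    n = len(signal)
    return [
        sum(x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(signal))
        for k in range(n)
    ]


def spectral_tokens(signal, num_tokens):
    """Assumed tokenization: magnitudes of the first `num_tokens` coefficients."""
    return [abs(c) for c in dft(signal)[:num_tokens]]


def circular_shift(signal, offset):
    """Rotate the sequence to the right by `offset` positions (wrapping around)."""
    offset %= len(signal)
    return signal[-offset:] + signal[:-offset]


def pairwise_attention_cost(num_tokens):
    """Number of token pairs scored by full self-attention."""
    return num_tokens * num_tokens


signal = [0.0, 1.0, 4.0, 2.0, 0.0, 0.0, 3.0, 1.0]
tokens = spectral_tokens(signal, 3)
shifted_tokens = spectral_tokens(circular_shift(signal, 3), 3)

# Fourier magnitudes are unchanged by a circular shift of the input.
max_difference = max(abs(a - b) for a, b in zip(tokens, shifted_tokens))

# One token per sample versus three spectral tokens.
spatial_cost = pairwise_attention_cost(len(signal))
spectral_cost = pairwise_attention_cost(len(tokens))
```

Here `max_difference` is zero up to floating-point error, and the pairwise cost drops from 64 to 9. A 2D image version would apply the same idea along both axes; whether the paper's projection works this way is not something the abstract confirms.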

Alexandra G. Roberts, Maneesh John, Jinwei Zhang, Dominick Romano, Mert Sisman, Ki Sueng Choi, Heejong Kim, Mert R. Sabuncu, Thanh D. Nguyen, Alexey V. Dimov, Pascal Spincemaille, Brian H. Kopell, Yi Wang • 2026

Related benchmarks

Task | Dataset | Result | Rank
Neurostimulation candidate classification | Clinical Deep Brain Stimulation (DBS) (external test) | Balanced accuracy: 78.3 | 6
Sex classification | IXI | Accuracy: 79.5 | 5
