FAAST: Forward-Only Associative Learning via Closed-Form Fast Weights for Test-Time Supervised Adaptation
About
Adapting pretrained models typically involves a trade-off between the high training cost of backpropagation and the heavy inference overhead of memory-based or in-context learning. We propose FAAST, a forward-only associative adaptation method that analytically compiles labeled examples into fast weights in a single pass. By eliminating the dependence on memory or context, FAAST achieves constant-time inference and decouples task adaptation from the pretrained representation. Across image classification and language modeling benchmarks, FAAST matches or exceeds backprop-based adaptation while reducing adaptation time by over 90%, and it is competitive with memory/context-based adaptation while reducing memory usage by up to 95%. These results show that FAAST is a highly efficient, scalable solution for supervised task adaptation, particularly for resource-constrained models. We release the code and models at https://github.com/baoguangsheng/faast.
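To make the idea of compiling labeled examples into fast weights without backpropagation more concrete, here is a minimal sketch. It is not the FAAST algorithm from the paper or its repository: it assumes that the closed-form step can be illustrated by ridge regression from frozen features to one-hot labels, and the names `compile_fast_weights`, `predict`, and `ridge` are hypothetical. The synthetic Gaussian features stand in for the outputs of a pretrained encoder.

```python
import numpy as np


def compile_fast_weights(features, labels, num_classes, ridge=1.0):
    """Solve W = (F^T F + ridge * I)^{-1} F^T Y in closed form.

    Assumption: ridge regression is used here only as a stand-in for a
    closed-form associative update; it is not taken from the paper.
    """
    targets = np.eye(num_classes)[labels]  # one-hot label matrix Y
    dim = features.shape[1]
    gram = features.T @ features + ridge * np.eye(dim)
    return np.linalg.solve(gram, features.T @ targets)


def predict(features, fast_weights):
    """One matrix product per query; the support set is not consulted."""
    return (features @ fast_weights).argmax(axis=1)


# Synthetic stand-in for frozen pretrained features (hypothetical data).
rng = np.random.default_rng(0)
centers = rng.normal(size=(3, 16)) * 5.0

support_y = np.repeat(np.arange(3), 5)  # 3-way, 5-shot support set
support_x = centers[support_y] + rng.normal(size=(15, 16))

query_y = np.repeat(np.arange(3), 20)
query_x = centers[query_y] + rng.normal(size=(60, 16))

W = compile_fast_weights(support_x, support_y, num_classes=3)
accuracy = (predict(query_x, W) == query_y).mean()
print(f"query accuracy: {accuracy:.2f}")
```

In this sketch the adaptation cost is a single linear solve over the support features, and the inference cost is independent of how many labeled examples were compiled, which mirrors the constant-time inference property described above.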
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Language Modeling | WikiText-103 | PPL 13.23 | 216 |
| Machine Translation | IWSLT en-de 2017 (test) | BLEU 37.1 | 46 |
| Image Classification | CIFAR-10 (full) | Top-1 Acc 86.7 | 29 |
| Machine Translation | IWSLT Fr-En 2017 (test) | BLEU 43.93 | 22 |
| Machine Translation | IWSLT En-Fr 2017 (test) | BLEU 37.08 | 11 |
| Sentiment Classification | SST-2 (All) | Accuracy 87.5 | 7 |
| Image Classification | CIFAR10 10-way 5-shot | Accuracy 73.8 | 6 |
| Image Classification | miniImageNet 20-way 5-shot | Accuracy 88.6 | 6 |
| Image Classification | miniImageNet 20-way (Full) | Accuracy 93 | 6 |
| Sentiment Classification | SST-2 1-shot | Accuracy 78.5 | 2 |