
Chunked TabPFN: Exact Training-Free In-Context Learning for Long-Context Tabular Data

About

TabPFN v2 achieves better results than tree-based models on several tabular benchmarks, which is notable since tree-based models are usually the strongest choice for tabular data. However, it cannot handle more than 10K context tokens because transformers have quadratic computation and memory costs. Unlike existing approaches that rely on context compression, such as selecting representative samples via K-nearest neighbors (KNN), we introduce a tiled-block strategy to compute attention within the TabPFN framework. This design is compatible with standard GPU setups and, to the best of our knowledge, is the first to enable TabPFN to process long contexts without any pre-processing. We demonstrate the effectiveness of our approach on the standard TabArena benchmark, with code available at https://github.com/mrsergazinov/chunk_tabpfn.
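To illustrate how attention can be computed block by block without changing the result, here is a minimal NumPy sketch. It is not taken from the linked repository. It assumes a single attention head, no masking, and the standard running-max softmax rescaling used in tiled attention. The function names `naive_attention` and `chunked_attention` and the chunk-size parameters are hypothetical, chosen for illustration only.

```python
import numpy as np


def naive_attention(q, k, v):
    """Reference softmax attention that builds the full score matrix."""
    scores = q @ k.T / np.sqrt(q.shape[-1])
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v


def chunked_attention(q, k, v, q_chunk=128, kv_chunk=128):
    """Same result as naive_attention, computed one tile at a time.

    Only a (q_chunk x kv_chunk) block of scores exists at any moment,
    so peak memory no longer grows with the product of the two lengths.
    """
    n_q, d = q.shape
    scale = 1.0 / np.sqrt(d)
    out = np.empty((n_q, v.shape[-1]), dtype=np.float64)

    for qs in range(0, n_q, q_chunk):
        q_blk = q[qs:qs + q_chunk]
        rows = q_blk.shape[0]
        running_max = np.full((rows, 1), -np.inf)  # row-wise max seen so far
        running_sum = np.zeros((rows, 1))          # softmax denominator so far
        acc = np.zeros((rows, v.shape[-1]))        # unnormalized output so far

        for ks in range(0, k.shape[0], kv_chunk):
            k_blk = k[ks:ks + kv_chunk]
            v_blk = v[ks:ks + kv_chunk]
            s = (q_blk @ k_blk.T) * scale

            new_max = np.maximum(running_max, s.max(axis=-1, keepdims=True))
            # Rescale earlier partial results to the new reference maximum.
            correction = np.exp(running_max - new_max)
            p = np.exp(s - new_max)

            running_sum = running_sum * correction + p.sum(axis=-1, keepdims=True)
            acc = acc * correction + p @ v_blk
            running_max = new_max

        out[qs:qs + q_chunk] = acc / running_sum

    return out
```

Because the rescaling is algebraically exact, the tiled output should match the full computation up to floating-point error, which is what separates this family of methods from context compression such as KNN-based sample selection.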

Renat Sergazinov, Shao-An Yin (2025)

Related benchmarks

Task                   | Dataset                            | Metric               | Result  | Rank
-----------------------|------------------------------------|----------------------|---------|-----
Classification         | Adult                              | Accuracy             | 91.3    | 86
Classification         | Diabetes                           | Accuracy             | 81.64   | 80
Classification         | Credit                             | -                    | -       | 63
Classification         | German                             | Accuracy             | 79.4    | 58
Tabular Prediction     | TabArena all 51 datasets           | Elo Rating           | 1.27e+3 | 38
Classification         | Synthetic                          | Accuracy             | 91.8    | 31
Classification         | Census KDD                         | Accuracy             | 87.59   | 26
Classification         | blood                              | ROC-AUC              | 0.7805  | 23
Tabular Learning       | Long-context 15 datasets v2 (test) | Avg. Normalized RMSE | 0.534   | 9
Tabular Classification | Bank                               | Mean AUC-ROC         | 91.92   | 7
Showing 10 of 15 rows
