
VIP-COP: Context Optimization for Tabular Foundation Models

About

Tabular foundation models (TFMs) have emerged as a powerful paradigm for in-context learning on structured data, enabling direct prediction on new tabular tasks without task-specific training. However, their effectiveness is constrained by context length limits, which restrict their application to medium-scale data and degrade performance when inference-time datasets are larger than those seen during pretraining. Our work introduces VIP-COP, which estimates the Value of Importance for Prediction of training examples and features to perform hard Context OPtimization for TFMs. Its explicit selection mechanism suppresses noise and isolates influential data, which also lets the model benefit from data augmentation by prioritizing high-value augmented samples and features. VIP-COP is (i) fast, often boosting performance within minutes of optimization, based on an online KernelSHAP-based regression with iterative refinement, value-guided context sampling, and multi-fidelity pruning; (ii) budget-aware and any-time, improving with additional test-time compute, unlike heuristics that produce fixed contexts; (iii) model-aware yet fully black-box, requiring no access to model internals, which makes it compatible with both proprietary and open-source TFMs; (iv) interpretable, identifying discrete "Very Important Predictors" (samples and features) that maximize signal-to-noise, which in turn makes it (v) robust, isolating high-value data from noise. In contrast, soft-prompt optimization requires model gradients, produces abstract latent tokens, and lacks explicit signal discrimination. Extensive experiments show that VIP-COP consistently outperforms heuristic and optimized baselines across large-scale, high-dimensional testbeds, including data-augmentation and data-noise settings, establishing a new state of the art in test-time context refinement for TFMs.

Yilong Chen, Xueying Ding, Leman Akoglu • 2026
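
The abstract describes estimating a value for each training example through a black-box, regression-based procedure and then keeping the high-value examples as context. The sketch below illustrates that general idea only and is not the authors' algorithm: it scores random fixed-size contexts with a stand-in black box (a 1-nearest-neighbour classifier instead of a TFM), fits a ridge-regularized linear surrogate from inclusion masks to scores, and keeps the top-valued examples. It omits the Shapley kernel weighting, iterative refinement, value-guided sampling, and multi-fidelity pruning named in the abstract. All function names, the toy data, and the hyperparameters are assumptions made for illustration.

```python
import numpy as np

def knn1_accuracy(ctx_X, ctx_y, val_X, val_y):
    """Stand-in black box: accuracy of 1-nearest-neighbour given a context."""
    d = ((val_X[:, None, :] - ctx_X[None, :, :]) ** 2).sum(axis=2)
    pred = ctx_y[d.argmin(axis=1)]
    return float((pred == val_y).mean())

def estimate_values(score_fn, n, k, n_rounds, rng, ridge=1e-3):
    """Fit a linear surrogate score ~ mask @ phi from random size-k contexts.

    Simplification: uniform sampling and plain ridge regression, without
    the Shapley kernel weights that KernelSHAP would use.
    """
    masks = np.zeros((n_rounds, n))
    scores = np.zeros(n_rounds)
    for r in range(n_rounds):
        idx = rng.choice(n, size=k, replace=False)  # one candidate context
        masks[r, idx] = 1.0
        scores[r] = score_fn(idx)                   # one black-box query
    Xc = masks - masks.mean(axis=0)                 # centring absorbs the intercept
    yc = scores - scores.mean()
    return np.linalg.solve(Xc.T @ Xc + ridge * np.eye(n), Xc.T @ yc)

def make_toy(rng, n_train=40, n_val=200, n_noisy=8):
    """Two well-separated Gaussian classes; some training labels are flipped."""
    def sample(m):
        y = rng.integers(0, 2, size=m)
        X = rng.normal(size=(m, 2)) + np.where(y[:, None] == 1, 2.0, -2.0)
        return X, y
    X_tr, y_tr = sample(n_train)
    X_val, y_val = sample(n_val)
    noisy = rng.choice(n_train, size=n_noisy, replace=False)
    y_tr = y_tr.copy()
    y_tr[noisy] = 1 - y_tr[noisy]                   # inject label noise
    return X_tr, y_tr, X_val, y_val, noisy

rng = np.random.default_rng(0)
X_tr, y_tr, X_val, y_val, noisy = make_toy(rng)

def score(idx):
    return knn1_accuracy(X_tr[idx], y_tr[idx], X_val, y_val)

phi = estimate_values(score, n=len(X_tr), k=10, n_rounds=400, rng=rng)
context = np.argsort(phi)[-10:]                     # keep the 10 highest-valued examples
```

In this toy setting the label-flipped examples should receive lower estimated values on average than the clean ones, which is the kind of signal a hard selection step can exploit. Because the surrogate only needs scores from `score_fn`, the same loop would apply to any model that can be queried with a context, and running more rounds refines the estimates.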

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Classification | Electricity | -- | 27 |
| Classification | tic-tac-toe | ROC-AUC 71.7 | 21 |
| Tabular multi-class classification | HardCOp 38 datasets Original | -- | 20 |
| Binary Classification | Electricity | AUC 76.1 | 18 |
| Binary Classification | Bank | AUC 76.4 | 16 |
| Classification | Shuttle | Balanced Accuracy 99.7 | 14 |
| Classification | Bank | Balanced Accuracy 79.4 | 14 |
| Tabular Classification | HardCOp4DN S2noi | Performance Score 68.1 | 12 |
| Classification | HardCOp - Amazon_employee_access original (test) | Balanced Accuracy 93.4 | 12 |
| Tabular multi-class classification | HardCOp4DA 38 datasets | -- | 10 |
