Task Scarcity and Label Leakage in Relational Transfer Learning
About
Training relational foundation models requires learning representations that transfer across tasks, yet available supervision is typically limited to a small number of prediction targets per database. This task scarcity causes learned representations to encode task-specific shortcuts that degrade transfer even within the same schema, a problem we call label leakage. We study this using K-Space, a modular architecture combining frozen pretrained tabular encoders with a lightweight message-passing core. To suppress leakage, we introduce a gradient projection method that removes label-predictive directions from representation updates. On RelBench, this improves within-dataset transfer by +0.145 AUROC on average, often recovering near single-task performance. Our results suggest that limited task diversity, not just limited data, constrains relational foundation models.
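The abstract does not spell out how the gradient projection is computed, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes the label-predictive directions are available as a small set of linearly independent vectors (for example, weights of a linear probe trained to predict the label from the representation) and removes the component of a representation gradient that lies in their span. The names `project_out`, `w_label`, and `grad` are hypothetical.

```python
import numpy as np

def project_out(grad, directions):
    """Return `grad` with its component in span(`directions`) removed.

    grad:       array of shape (d,) or (n, d), gradient w.r.t. representations.
    directions: array-like of shape (k, d), assumed linearly independent.
    """
    dirs = np.asarray(directions, dtype=float)
    # Orthonormalize the directions; q has shape (d, k) with orthonormal columns.
    q, _ = np.linalg.qr(dirs.T)
    # Subtract the orthogonal projection of grad onto the column space of q.
    return grad - (grad @ q) @ q.T

rng = np.random.default_rng(0)
d = 8
w_label = rng.normal(size=d)    # assumed label-predictive direction (hypothetical probe weights)
grad = rng.normal(size=(4, d))  # a batch of representation gradients

clean = project_out(grad, [w_label])
```

After the projection, every row of `clean` is orthogonal to `w_label`, so an update along `clean` does not move the representation along that label-predictive direction to first order. How the directions are estimated and at which layer the projection is applied are design choices the abstract leaves open.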
Related benchmarks
| Task | Dataset | ROC-AUC | Rank |
|---|---|---|---|
| User Clicks Prediction | rel-avito | 64.6 | 84 |
| User Engagement Prediction | rel-stack | 84.4 | 69 |
| Driver DNF Prediction | rel-f1 | 0.735 | 54 |
| Driver Top 3 Prediction | rel-f1 | 82.9 | 54 |
| Item Churn Prediction | rel-amazon | 77.8 | 54 |
| User Churn Prediction | rel-amazon | 0.644 | 54 |
| Study Outcome Prediction | rel-trial | 0.651 | 52 |
| User Churn Prediction | rel-hm | 67.4 | 52 |
| User Ignore Prediction | rel-event | 0.858 | 50 |
| User Repeat Prediction | rel-event | 73.6 | 50 |