Relational In-Context Learning via Synthetic Pre-training with Structural Prior
About
Relational Databases (RDBs) are the backbone of modern business, yet they lack foundation models comparable to those in text or vision. A key obstacle is that high-quality RDBs are private, scarce, and structurally heterogeneous, making internet-scale pre-training infeasible. To overcome this data scarcity, we introduce RDB-PFN, the first relational foundation model trained purely on synthetic data. Inspired by Prior-Data Fitted Networks (PFNs), where synthetic data generated from Structural Causal Models (SCMs) enables reasoning on single tables, we design a Relational Prior Generator to create an infinite stream of diverse RDBs from scratch. Pre-trained on over 2 million synthetic single-table and relational tasks, RDB-PFN learns to adapt to any new database instantly via genuine in-context learning. Experiments show that RDB-PFN achieves strong few-shot performance on 19 real-world relational prediction tasks, outperforming state-of-the-art tabular foundation models evaluated on the same DFS-linearized inputs, while using a lightweight architecture and offering fast inference. The code is available at https://github.com/MuLabPKU/RDBPFN.
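To make the pipeline described in the abstract more concrete, the following is a minimal sketch, not the authors' Relational Prior Generator. It assumes a toy two-table schema (parents and children linked by a foreign key), a tiny hand-rolled causal mechanism standing in for an SCM, and that "DFS" refers to Deep Feature Synthesis-style aggregation of child rows into one row per parent. All function names, column names, and parameters here are hypothetical.

```python
import random
from collections import defaultdict


def sample_toy_rdb(n_parents=50, max_children=5, seed=0):
    """Sample a toy two-table database (hypothetical stand-in for a relational prior)."""
    rng = random.Random(seed)
    # Random weights play the role of a very small structural causal model.
    w_parent, w_child = rng.gauss(0, 1), rng.gauss(0, 1)
    parents, children = [], []
    for pid in range(n_parents):
        x = rng.gauss(0, 1)  # exogenous parent feature
        parents.append({"id": pid, "x": x})
        for _ in range(rng.randint(0, max_children)):
            # Each child feature depends on its parent's feature plus noise.
            z = w_parent * x + rng.gauss(0, 1)
            children.append({"parent_id": pid, "z": z})

    # The binary label depends on the parent feature and the mean child feature.
    by_parent = defaultdict(list)
    for c in children:
        by_parent[c["parent_id"]].append(c["z"])
    for p in parents:
        zs = by_parent[p["id"]]
        mean_z = sum(zs) / len(zs) if zs else 0.0
        score = p["x"] + w_child * mean_z + rng.gauss(0, 0.5)
        p["y"] = int(score > 0)
    return parents, children


def dfs_linearize(parents, children):
    """Flatten to one row per parent using count/mean aggregates of its child rows."""
    by_parent = defaultdict(list)
    for c in children:
        by_parent[c["parent_id"]].append(c["z"])
    rows = []
    for p in parents:
        zs = by_parent[p["id"]]
        rows.append({
            "x": p["x"],
            "child_count": len(zs),
            "child_mean_z": sum(zs) / len(zs) if zs else 0.0,
            "y": p["y"],
        })
    return rows


parents, children = sample_toy_rdb()
table = dfs_linearize(parents, children)
# In an in-context setting, labeled rows form the context and the rest are queries.
context, query = table[:40], table[40:]
```

Each call with a different `seed` yields a new synthetic task. In a PFN-style setup, a model would be pre-trained across many such tasks to predict the `y` values of `query` rows given the labeled `context` rows, without task-specific fine-tuning.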
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| User Clicks Prediction | rel-avito | ROC-AUC | 62.47 | 84 |
| Driver Top 3 Prediction | rel-f1 | ROC-AUC | 80.9 | 70 |
| User Engagement Prediction | rel-stack | ROC-AUC | 82.55 | 69 |
| Driver DNF Prediction | rel-f1 | ROC-AUC | 0.7171 | 67 |
| Item Churn Prediction | rel-amazon | ROC-AUC | 75.7 | 64 |
| User Churn Prediction | Amazon Rel | ROC-AUC | 0.6103 | 64 |
| User Churn Prediction | rel-hm | ROC-AUC | 62.79 | 62 |
| Study Outcome Prediction | rel (trial) | ROC-AUC | 0.5858 | 52 |
| User Repeat Prediction | Rel Event | ROC-AUC | 66.29 | 50 |
| User Ignore Prediction | Rel Event | ROC-AUC | 0.8073 | 50 |