Relational In-Context Learning via Synthetic Pre-training with Structural Prior

About

Relational Databases (RDBs) are the backbone of modern business, yet they lack foundation models comparable to those in text or vision. A key obstacle is that high-quality RDBs are private, scarce, and structurally heterogeneous, making internet-scale pre-training infeasible. To overcome this data scarcity, we introduce RDB-PFN, the first relational foundation model trained purely via synthetic data. Inspired by Prior-Data Fitted Networks (PFNs), where synthetic data generated from Structural Causal Models (SCMs) enables reasoning on single tables, we design a Relational Prior Generator to create an infinite stream of diverse RDBs from scratch. Pre-training on over 2 million synthetic single-table and relational tasks, RDB-PFN learns to adapt to any new database instantly via genuine in-context learning. Experiments show that RDB-PFN achieves strong few-shot performance on 19 real-world relational prediction tasks, outperforming state-of-the-art tabular foundation models evaluated on the same DFS-linearized inputs, while using a lightweight architecture and fast inference. The code is available at https://github.com/MuLabPKU/RDBPFN.
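As a rough illustration of the two ideas named above (sampling a relational database from a structural causal model, then flattening it into a single table), here is a minimal sketch. It is not the paper's Relational Prior Generator. The two-table schema, the causal coefficients, and the function names `sample_toy_rdb` and `linearize` are all hypothetical, and it assumes "DFS" refers to Deep Feature Synthesis-style aggregation of child rows, which the abstract does not spell out.

```python
import random
from statistics import mean


def sample_toy_rdb(n_parents=50, max_children=5, seed=0):
    """Sample a two-table database from a toy structural causal model.

    Hypothetical schema: parents(id, x, label) and children(parent_id, value).
    """
    rng = random.Random(seed)
    parents, children = [], []
    for pid in range(n_parents):
        x = rng.gauss(0.0, 1.0)  # exogenous parent feature
        vals = []
        for _ in range(rng.randint(0, max_children)):
            # Child value is caused by the parent feature plus noise.
            v = 0.8 * x + rng.gauss(0.0, 0.5)
            children.append({"parent_id": pid, "value": v})
            vals.append(v)
        # Label depends on the parent feature and the mean of its children.
        score = x + (mean(vals) if vals else 0.0) + rng.gauss(0.0, 0.3)
        parents.append({"id": pid, "x": x, "label": int(score > 0)})
    return parents, children


def linearize(parents, children):
    """Flatten to one row per parent by aggregating its child rows."""
    by_parent = {}
    for c in children:
        by_parent.setdefault(c["parent_id"], []).append(c["value"])
    rows = []
    for p in parents:
        vals = by_parent.get(p["id"], [])
        rows.append({
            "x": p["x"],
            "child_count": len(vals),
            "child_mean": mean(vals) if vals else 0.0,
            "label": p["label"],
        })
    return rows


parents, children = sample_toy_rdb()
rows = linearize(parents, children)
```

Each call with a different `seed` yields a new synthetic task. A few labeled rows from `rows` could then serve as context examples for a tabular in-context learner, with the remaining rows held out as queries.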

Yanbo Wang, Jiaxuan You, Chuan Shi, Muhan Zhang • 2026

Related benchmarks

Task                        Dataset      Metric   Result  Rank
User Clicks Prediction      rel-avito    ROC-AUC  62.47   84
Driver Top 3 Prediction     rel-f1       ROC-AUC  80.9    70
User Engagement Prediction  rel-stack    ROC-AUC  82.55   69
Driver DNF Prediction       rel-f1       ROC-AUC  0.7171  67
Item Churn Prediction       rel-amazon   ROC-AUC  75.7    64
User Churn Prediction       Amazon Rel   ROC-AUC  0.6103  64
User Churn Prediction       rel-hm       ROC-AUC  62.79   62
Study Outcome Prediction    rel (trial)  ROC-AUC  0.5858  52
User Repeat Prediction      Rel Event    ROC-AUC  66.29   50
User Ignore Prediction      Rel Event    ROC-AUC  0.8073  50
Showing 10 of 29 rows
