Privacy Preserving Vertical Federated Learning for Tree-based Models

About

Federated learning (FL) is an emerging paradigm that enables multiple organizations to jointly train a model without revealing their private data to each other. This paper studies vertical federated learning, which addresses scenarios where (i) collaborating organizations own data on the same set of users but with disjoint features, and (ii) only one organization holds the labels. We propose Pivot, a novel solution for privacy-preserving vertical decision tree training and prediction, which ensures that no intermediate information is disclosed other than what the clients have agreed to release (i.e., the final tree model and the prediction output). Pivot does not rely on any trusted third party and provides protection against a semi-honest adversary that may compromise $m-1$ out of $m$ clients. We further identify two privacy leakages that arise when the trained decision tree model is released in plaintext, and propose an enhanced protocol to mitigate them. The proposed solution can also be extended to tree ensemble models, e.g., random forest (RF) and gradient boosting decision tree (GBDT), by treating single decision trees as building blocks. Theoretical and experimental analyses suggest that Pivot is efficient for the privacy achieved.

Yuncheng Wu, Shaofeng Cai, Xiaokui Xiao, Gang Chen, Beng Chin Ooi• 2020
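
The vertical setting and the $m-1$ out of $m$ threat model described in the abstract can be pictured with a small sketch. This is not Pivot's protocol; the abstract does not spell out the cryptographic building blocks, so the example swaps in plain additive secret sharing, a standard technique, purely to illustrate why any $m-1$ colluding clients can learn nothing about a shared value. The client names, feature names, field modulus, and helper functions are all hypothetical.

```python
import secrets

# Illustrative field modulus (an assumption, not taken from the paper).
PRIME = 2**61 - 1


def share(value, m):
    """Split `value` into m additive shares modulo PRIME.

    The first m-1 shares are uniformly random, so any m-1 of the m shares
    are jointly uniform and carry no information about `value`.
    """
    shares = [secrets.randbelow(PRIME) for _ in range(m - 1)]
    shares.append((value - sum(shares)) % PRIME)
    return shares


def reconstruct(shares):
    """Recombine all m shares; fewer than m shares do not suffice."""
    return sum(shares) % PRIME


# Vertical partitioning: every client holds the same users but disjoint
# features, and only one client (here client_a) holds the labels.
client_a = {"u1": {"age": 34}, "u2": {"age": 51}, "u3": {"age": 27}}
client_b = {"u1": {"income": 72}, "u2": {"income": 48}, "u3": {"income": 95}}
labels_at_a = {"u1": 1, "u2": 0, "u3": 1}

# Example intermediate statistic: the number of positive labels.
# Instead of revealing it, client_a distributes one share to each client.
m = 3
positive_count = sum(labels_at_a.values())
shares = share(positive_count, m)
recovered = reconstruct(shares)
```

Only when all `m` shares are combined does `recovered` equal `positive_count`; a real protocol would compute on such shares rather than reconstruct intermediate values.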

Related benchmarks

Task                   Dataset                                          Metric             Result   Rank
Classification         SKINNONSKIN                                      F1 Score           74.3     7
Binary Classification  Breast cancer                                    F1 Score           91.9     5
Binary Classification  a9a                                              F1 Score           65.3     5
Binary Classification  cod-rna                                          F1 Score           40.8     5
Binary Classification  covtype binary                                   F1 Score           57.2     5
Binary Classification  Phishing                                         F1 Score           95.7     5
Private GBDT Training  Synthetic LAN (n=5x10^4, D=4, B=8, m0=8, m1=7)   Training Time (s)  1.68e+3  3
Private GBDT Training  Synthetic LAN (n=2x10^5, D=4, B=8, m0=8, m1=7)   Training Time (s)  448      3