
Data Collaboration Analysis with Orthonormal Basis Selection and Alignment

About

Data Collaboration (DC) enables multiple parties to jointly train a model by sharing only linear projections of their private datasets. The core challenge in DC is to align the bases of these projections without revealing each party's secret basis. While existing theory suggests that any target basis spanning the common subspace should suffice, in practice the choice of basis can substantially affect both accuracy and numerical stability. We introduce Orthonormal Data Collaboration (ODC), which enforces orthonormal secret and target bases, thereby reducing alignment to the classical Orthogonal Procrustes problem, which admits a closed-form solution. We prove that the resulting change-of-basis matrices achieve orthogonal concordance, aligning all parties' representations up to a shared orthogonal transform and rendering downstream performance invariant to the target basis. Computationally, ODC reduces the alignment complexity from O(min{a(cl)^2, a^2cl}) to O(acl^2), and empirical evaluations show speedups of up to 100 times with equal or better accuracy across benchmarks. ODC preserves DC's one-round communication pattern and privacy assumptions, providing a simple and efficient drop-in improvement to existing DC pipelines.
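
The closed-form solution mentioned in the abstract can be illustrated with a minimal NumPy sketch of the classical Orthogonal Procrustes problem. This is not the authors' ODC implementation: the function name, the variable names, the problem sizes, and the simplification of aligning one party directly onto another (rather than onto a separately constructed target basis) are all assumptions made for illustration. The sketch assumes both parties project a shared anchor dataset with orthonormal bases that span the same subspace.

```python
import numpy as np

def orthogonal_procrustes(A, B):
    """Return the orthogonal matrix G minimizing ||A @ G - B||_F.

    Classical closed form: if A.T @ B = U @ diag(s) @ Vt, then G = U @ Vt.
    """
    U, _, Vt = np.linalg.svd(A.T @ B)
    return U @ Vt

rng = np.random.default_rng(0)

# Hypothetical sizes: anchor rows, original feature dimension, reduced dimension.
n_anchor, n_features, n_reduced = 200, 20, 5
X_anchor = rng.standard_normal((n_anchor, n_features))

# Party 1's secret basis: orthonormal columns obtained from a QR factorization.
F1, _ = np.linalg.qr(rng.standard_normal((n_features, n_reduced)))

# Party 2's secret basis: same subspace, rotated by a random orthogonal matrix R.
R, _ = np.linalg.qr(rng.standard_normal((n_reduced, n_reduced)))
F2 = F1 @ R

# Each party shares only its projection of the anchor data.
A1 = X_anchor @ F1
A2 = X_anchor @ F2

# Align party 2's representation onto party 1's (illustrative target choice).
G = orthogonal_procrustes(A2, A1)

print("G is orthogonal:", np.allclose(G.T @ G, np.eye(n_reduced)))
print("alignment residual:", np.linalg.norm(A2 @ G - A1))
```

In this idealized setting the recovered transform undoes the rotation between the two bases, so the residual is at the level of floating-point error. With noisy or only approximately shared subspaces, the same formula would give the best orthogonal fit in the Frobenius norm rather than an exact match.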

Keiyu Nosaka, Yamato Suetake, Yuichi Takano, Akiko Yoshise • 2024

Related benchmarks

Task                       Dataset            Result              Rank
Image Classification       MNIST              Accuracy 95.9       398
Image Classification       Fashion MNIST      Accuracy 86         300
Classification             CelebA             Avg Accuracy 86.2   185
Classification             Adult              Accuracy 85         21
Property Prediction        AMES               ROC-AUC 0.886       18
Property Prediction        CYP3A4             PR-AUC 82.1         18
Length-of-Stay Prediction  eICU-CRD (test)    RMSE 4.07           15
Property Prediction        Tox21 SR-ARE       ROC-AUC 75.4        13
Property Prediction        HIV                ROC-AUC 80.8        13
Property Prediction        CYP2D6             ROC-AUC 83.3        13
Showing 10 of 11 rows
