Little is Enough: Boosting Privacy by Sharing Only Hard Labels in Federated Semi-Supervised Learning

About

In many critical applications, sensitive data is inherently distributed and cannot be centralized due to privacy concerns. A wide range of federated learning approaches have been proposed to train models locally at each client without sharing their sensitive data, typically by exchanging model parameters, probabilistic predictions (soft labels) on a public dataset, or a combination of both. However, these methods still disclose private information and restrict local models to those that can be trained using gradient-based methods. We propose a federated co-training (FedCT) approach that improves privacy by sharing only definitive (hard) labels on a public unlabeled dataset. Clients use a consensus of these shared labels as pseudo-labels for local training. This federated co-training approach empirically enhances privacy without compromising model quality. In addition, it allows the use of local models that are not suitable for parameter aggregation in traditional federated learning, such as gradient-boosted decision trees, rule ensembles, and random forests. Furthermore, we observe that FedCT performs effectively in federated fine-tuning of large language models, where its pseudo-labeling mechanism is particularly beneficial. Empirical evaluations and theoretical analyses suggest its applicability across a range of federated learning scenarios.
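The consensus step described in the abstract can be illustrated with a minimal sketch. It assumes the consensus is a simple majority vote over the clients' hard labels, with ties broken in favor of the smallest class index; the abstract does not specify either detail, and the function name `consensus_labels` is hypothetical.

```python
from collections import Counter
from typing import List, Sequence


def consensus_labels(client_labels: Sequence[Sequence[int]]) -> List[int]:
    """Combine per-client hard labels into one pseudo-label per public sample.

    client_labels[i][j] is the hard label that client i predicts for public
    sample j. Majority voting and the tie-break rule (smallest class index)
    are assumptions made for this illustration.
    """
    consensus = []
    # zip(*...) regroups the labels so each tuple holds all votes for one sample.
    for votes in zip(*client_labels):
        counts = Counter(votes)
        top = max(counts.values())
        # Among the most-voted labels, pick the smallest for determinism.
        consensus.append(min(label for label, c in counts.items() if c == top))
    return consensus


# Three clients label four public samples; only these labels are shared.
shared = [
    [0, 1, 2, 1],
    [0, 1, 1, 1],
    [1, 1, 2, 0],
]
pseudo_labels = consensus_labels(shared)
print(pseudo_labels)
```

Each client would then add the public samples, paired with these pseudo-labels, to its local training set. Because only class indices are exchanged, the local model can be of any type that supports supervised training.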

Amr Abourayya, Jens Kleesiek, Kanishka Rao, Erman Ayday, Bharat Rao, Geoff Webb, Michael Kamp • 2023

Related benchmarks

Task	Dataset	Result
Image Classification	TinyImageNet (test)	Accuracy: 40.7	562
Image Classification	CIFAR-100	Accuracy: 42	375
Image Classification	CIFAR100	Accuracy: 34.5	301
Image Classification	CIFAR10	Accuracy (%): 76.9	282
Image Classification	EMNIST (test)	Accuracy: 91.5	272
Image Classification	CIFAR10	Accuracy: 77.2	143
Image Classification	TinyImageNet	Accuracy: 35.9	135
Image Classification	CIFAR100 (test)	Accuracy: 39.3	98
Image Classification	EMNIST	Accuracy: 91.2	90

