Test-Time Robust Personalization for Federated Learning
About
Federated Learning (FL) is a machine learning paradigm in which many clients collaboratively learn a shared global model from decentralized training data. Personalized FL additionally adapts the global model to individual clients, and achieves promising results when local training and test distributions are consistent. For real-world personalized FL applications, however, it is crucial to go one step further: making FL models robust to the evolving local test set during deployment, where various distribution shifts can arise. In this work, we identify the pitfalls of existing methods under test-time distribution shifts and propose Federated Test-time Head Ensemble plus tuning (FedTHE+), which personalizes FL models while remaining robust to various test-time distribution shifts. We demonstrate the improvement of FedTHE+ (and its computationally efficient variant, FedTHE) over strong competitors by training various neural architectures (CNN, ResNet, and Transformer) on CIFAR10 and ImageNet with various test distributions. Alongside this, we build a benchmark for assessing the performance and robustness of personalized FL methods during deployment. Code: https://github.com/LINs-lab/FedTHE.
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Domain Generalization | PACS | Accuracy (Art) | 96.17 | 221 |
| Domain Generalization | PACS (leave-one-domain-out) | Art Accuracy | 96.17 | 146 |
| Out-of-Distribution Detection | CIFAR-10 (ID) vs SVHN (OOD) (test) | AUROC | 85.95 | 79 |
| OOD Detection | CIFAR-10 IND iSUN OOD | AUROC | 83.5 | 42 |
| OOD Detection | Textures (OOD) with CIFAR-10 (ID) (test) | FPR@95 | 53.58 | 40 |
| Out-of-Distribution Detection | CIFAR10 (ID) vs SVHN (OOD) | AUROC | 83.5 | 37 |
| Federated Image Classification | CIFAR-100 and CIFAR-100-C brightness (test) | Accuracy (In-Distribution) | 0.7383 | 33 |
| Federated Out-of-Distribution Detection | CIFAR-100 (ID) and LSUN-C (OOD) (test) | FPR@95 | 64.73 | 33 |
| Out-of-Distribution Detection | LSUN (Out-of-distribution) vs CIFAR-10 (In-distribution) | AUROC | 83.55 | 28 |
| OOD Detection | CIFAR-10 (In-distribution) vs LSUN-R (Out-of-distribution) | FPR95 | 34.94 | 25 |