A-FedPD: Aligning Dual-Drift is All Federated Primal-Dual Learning Needs

About

As a popular paradigm for balancing data privacy and collaborative training, federated learning (FL) is widely used to process large-scale heterogeneous datasets in a distributed manner on edge clients. Because of bandwidth limitations and security considerations, FL splits the original problem into multiple subproblems that are solved in parallel, which makes primal-dual solutions highly valuable in FL. In this paper, we review the recent development of classical federated primal-dual methods and point out a serious common defect of such methods in non-convex scenarios, which we call "dual drift": it is caused by the dual hysteresis of long-inactive clients under partial-participation training. To address this problem, we propose a novel Aligned Federated Primal Dual (A-FedPD) method, which constructs virtual dual updates to align the global consensus and the local dual variables of clients that have gone without participating for a long time. We also provide a comprehensive analysis of the optimization and generalization efficiency of A-FedPD on smooth non-convex objectives, which confirms its high efficiency and practicality. Extensive experiments on several classical FL setups validate the effectiveness of the proposed method.
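
The abstract does not state the exact update rule, so the following is only a minimal sketch of what a "virtual dual update" could look like, not the authors' implementation. It assumes an ADMM-style dual ascent step for active clients, and assumes that inactive clients are aligned by treating the average of the returned local models as a stand-in for their own. The function name `dual_step`, the penalty `rho`, and the array layout are all hypothetical.

```python
import numpy as np

def dual_step(theta, duals, local_models, active, rho=1.0, align=True):
    """One round of dual updates under partial participation (illustrative only).

    theta        : (d,) global consensus model broadcast this round
    duals        : (n, d) per-client dual variables (left unmodified)
    local_models : (k, d) models returned by the k active clients
    active       : (k,) indices of the active clients
    rho          : penalty coefficient (hypothetical name)
    align        : if True, apply the assumed virtual update to inactive clients
    """
    n = duals.shape[0]
    new_duals = duals.copy()

    # Standard dual ascent for clients that actually trained this round.
    new_duals[active] = duals[active] + rho * (local_models - theta)

    if align:
        # ASSUMPTION: inactive clients use the average of the returned local
        # models in place of their own, so their duals do not fall behind.
        avg = local_models.mean(axis=0)
        inactive = np.setdiff1d(np.arange(n), active)
        new_duals[inactive] = duals[inactive] + rho * (avg - theta)

    return new_duals

# Toy usage: 4 clients, 2 of them active in this round.
theta = np.zeros(2)
duals = np.zeros((4, 2))
local_models = np.array([[1.0, 1.0], [3.0, 3.0]])
aligned = dual_step(theta, duals, local_models, active=[0, 2], rho=0.5)
stale = dual_step(theta, duals, local_models, active=[0, 2], rho=0.5, align=False)
```

With `align=False`, the duals of clients 1 and 3 are left unchanged while the global model moves on, which is the stale-dual situation the abstract calls dual drift. With `align=True`, every client's dual is refreshed each round under the stated assumption.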

Yan Sun, Li Shen, Dacheng Tao • 2024

Related benchmarks

Task                  Dataset              Metric                Result  Rank
Image Classification  CIFAR-10 IID         Accuracy              87.44   58
Image Classification  CIFAR-100 IID        Accuracy              55.56   37
Image Classification  CIFAR-100 Dir-0.1    Accuracy              53.15   28
Image Classification  CIFAR-10 Dir-1.0     Accuracy              86.46   16
Image Classification  CIFAR-10 Dir-0.1     Accuracy              82.48   16
Image Classification  CIFAR-100 Dir-1.0    Accuracy              54.62   16
Image Classification  CIFAR-10 (test)      Communication Rounds  131     12
Image Classification  CIFAR-100 (test)     Communication Rounds  123     12
Federated Learning    CIFAR-10 (train)     Time per Round (s)    11.71   7
Training Efficiency   ResNet architecture  Latency (s)           37.61   7