DR-DSGD: A Distributionally Robust Decentralized Learning Algorithm over Graphs

About

In this paper, we propose to solve a regularized distributionally robust learning problem in the decentralized setting, taking into account the data distribution shift. By adding a Kullback-Leibler regularization function to the robust min-max optimization problem, the learning problem can be reduced to a modified robust minimization problem and solved efficiently. Leveraging the newly formulated optimization problem, we propose a robust version of Decentralized Stochastic Gradient Descent (DSGD), coined Distributionally Robust Decentralized Stochastic Gradient Descent (DR-DSGD). Under some mild assumptions and provided that the regularization parameter is larger than one, we theoretically prove that DR-DSGD achieves a convergence rate of $\mathcal{O}\left(1/\sqrt{KT} + K/T\right)$, where $K$ is the number of devices and $T$ is the number of iterations. Simulation results show that our proposed algorithm can improve the worst-distribution test accuracy by up to $10\%$. Moreover, DR-DSGD is more communication-efficient than DSGD since it requires fewer communication rounds (up to $20\times$ fewer) to achieve the same worst-distribution test accuracy target. Furthermore, the conducted experiments reveal that DR-DSGD results in a fairer performance across devices in terms of test accuracy.
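
The reduction from a min-max problem to a plain minimization mentioned in the abstract can be illustrated numerically. The sketch below is not taken from the paper. It assumes the inner maximization is over weights on the probability simplex, that the KL term is measured against the uniform distribution, and that the regularization parameter is called mu (the abstract does not name it). Under those assumptions the inner maximum has the standard closed form mu * log((1/K) * sum_i exp(loss_i / mu)), attained at softmax weights. All function names are hypothetical.

```python
import numpy as np


def kl_to_uniform(lam):
    """KL divergence between lam (a point on the simplex) and the uniform distribution."""
    K = lam.size
    nz = lam > 0  # terms with lam_i = 0 contribute zero
    return float(np.sum(lam[nz] * np.log(lam[nz] * K)))


def regularized_objective(lam, losses, mu):
    """Inner objective (assumed form): sum_i lam_i * loss_i - mu * KL(lam || uniform)."""
    return float(lam @ losses - mu * kl_to_uniform(lam))


def closed_form_max(losses, mu):
    """mu * log((1/K) * sum_i exp(loss_i / mu)), computed in a numerically stable way."""
    z = losses / mu
    m = z.max()
    return float(mu * (m + np.log(np.mean(np.exp(z - m)))))


def optimal_weights(losses, mu):
    """Maximizing weights: a softmax of the per-device losses at temperature mu."""
    z = losses / mu
    w = np.exp(z - z.max())
    return w / w.sum()


# Hypothetical per-device losses and a regularization parameter larger than one.
losses = np.array([0.2, 1.0, 3.0])
mu = 2.0

lam_star = optimal_weights(losses, mu)
print("softmax weights:      ", lam_star)
print("objective at weights: ", regularized_objective(lam_star, losses, mu))
print("closed-form value:    ", closed_form_max(losses, mu))
```

The two printed objective values should agree up to floating-point error, and the closed-form value lies between the average loss and the worst loss. A large mu pulls it toward the average, while a small mu pulls it toward the worst case.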

Chaouki Ben Issaid, Anis Elgabli, Mehdi Bennis • 2022

Related benchmarks

Task	Dataset	Result
Regression	PovertyMap (test)	Worst-U/R Pearson Correlation: 0.7155	43
Wildlife Species Classification	WILDS-iWildCam ID (test)	Macro F1: 31.57	23
Image Classification	Cifar10 Dirichlet(0.3) (test)	--	21
Toxicity Classification	CivilComments (CC) (test)	Worst-Group Accuracy: 62.72	13
Tumor Detection	CAMELYON17 (test)	Accuracy: 92.7	9
Language Modeling	Pile uncopyrighted (test)	Worst Log-Perplexity: 8.023	9
Image Classification	Cifar10 Dirichlet(10) (test)	Worst Accuracy: 37	9
