Fed-Sophia: A Communication-Efficient Second-Order Federated Learning Algorithm
About
Federated learning is a machine learning approach in which multiple devices collaboratively train a model with the help of a parameter server by sharing only their local updates. While gradient-based (first-order) optimization techniques are widely adopted in this domain, the curvature information that second-order methods exploit is crucial for guiding and speeding up convergence. This paper introduces a scalable second-order method that makes curvature information usable when training large models in a federated setting. Our method, coined Fed-Sophia, combines a weighted moving average of the gradient with a clipping operation to find the descent direction. In addition, a lightweight estimate of the Hessian's diagonal is used to incorporate the curvature information. Numerical evaluation shows the superiority, robustness, and scalability of the proposed Fed-Sophia scheme compared to first- and second-order baselines.
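The abstract names three ingredients: a weighted moving average of the gradient, a diagonal Hessian estimate, and a clipping operation. It does not state the exact update rule, so the following is only a minimal sketch of how such a Sophia-style step could look inside a toy federated loop. Everything specific here is an assumption made for illustration: the function names, the hyperparameters (`lr`, `beta1`, `beta2`, `gamma`, `eps`, `rho`), the use of the exact diagonal of a toy quadratic in place of a stochastic Hessian estimator, the reset of optimizer state each round, and plain averaging at the server.

```python
def clip(x, rho):
    """Clamp x to the interval [-rho, rho]."""
    return max(-rho, min(rho, x))


def sophia_style_step(theta, m, h, grad, hess_diag,
                      lr=0.1, beta1=0.9, beta2=0.99,
                      gamma=1.0, eps=1e-12, rho=1.0):
    """One clipped, curvature-preconditioned step (hypothetical form)."""
    # Weighted moving average of the gradient.
    new_m = [beta1 * mi + (1 - beta1) * gi for mi, gi in zip(m, grad)]
    # Weighted moving average of the diagonal Hessian estimate.
    new_h = [beta2 * hi + (1 - beta2) * di for hi, di in zip(h, hess_diag)]
    # Precondition by curvature, then clip so no coordinate moves more than lr * rho.
    new_theta = [t - lr * clip(mi / max(gamma * hi, eps), rho)
                 for t, mi, hi in zip(theta, new_m, new_h)]
    return new_theta, new_m, new_h


def local_training(theta, curv, target, steps=5):
    """Run a few local steps on the toy objective
    0.5 * sum_i curv[i] * (theta[i] - target[i])**2."""
    # Assumption: optimizer state is re-initialized at the start of every round.
    m = [0.0] * len(theta)
    h = [0.0] * len(theta)
    for _ in range(steps):
        grad = [c * (t - s) for c, t, s in zip(curv, theta, target)]
        # Assumption: the exact diagonal (curv) stands in for a stochastic estimate.
        theta, m, h = sophia_style_step(theta, m, h, grad, curv)
    return theta


def server_average(models):
    """Coordinate-wise mean of the client models (assumed aggregation rule)."""
    n = len(models)
    return [sum(vals) / n for vals in zip(*models)]


# Two hypothetical clients, each described by (curvature, target) of its quadratic.
clients = [([1.0, 10.0], [1.0, -2.0]), ([2.0, 5.0], [3.0, 0.0])]
global_theta = [0.0, 0.0]
for _ in range(50):
    local_models = [local_training(list(global_theta), curv, target)
                    for curv, target in clients]
    global_theta = server_average(local_models)
```

In this sketch the clipping bounds every per-coordinate move by `lr * rho`, so a poor curvature estimate cannot produce an arbitrarily large step. Only the model parameters travel between clients and server here, which is a design choice of the sketch rather than something the abstract specifies.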
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Image Classification | Tiny ImageNet (test) | Accuracy | 24.46 | 265 |
| Image Classification | CIFAR-100 IID | Accuracy | 62.56 | 37 |
| Image Classification | Tiny-ImageNet Dirichlet-0.05 (test) | Accuracy | 36.65 | 32 |
| Image Classification | CIFAR-100 Dir-0.1 | Accuracy | 57.29 | 28 |
| Image Classification | Tiny-ImageNet (iid) | Accuracy | 49.86 | 12 |
| Image Classification | CIFAR-100 Dir-0.05 | Accuracy | 51.02 | 12 |
| Image Classification | Tiny-ImageNet Dir-0.1 | Accuracy | 41.89 | 12 |
| Image Classification | Tiny-ImageNet Dir-0.5 | Accuracy | 47.62 | 12 |
| Image Classification | CIFAR-100 Dir-0.5 | Accuracy | 60.62 | 12 |