Abstract
Federated Learning (FL) is a nascent decentralized learning framework under
which a massive number of heterogeneous clients collaboratively train a
model without revealing their local data. Limited communication, privacy
leakage, and Byzantine attacks are the key bottlenecks to system scalability.
In this paper, we focus on communication-efficient distributed (stochastic)
gradient descent for non-convex optimization, a driving force of FL. We propose
two algorithms, named {\em Adaptive Stochastic Sign SGD (Ada-StoSign)} and {\em
$\beta$-Stochastic Sign SGD ($\beta$-StoSign)}, each of which compresses the
local gradients into bit vectors.
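As a minimal illustration of this step (a sketch under common assumptions, not necessarily the paper's exact operator), the Python snippet below shows one standard way to stochastically compress a gradient into a $\pm 1$ bit vector using a scale $B$ that upper bounds its $\ell_{\infty}$ norm, so that the compressed vector is unbiased after rescaling by $B$.

```python
import numpy as np

def stochastic_sign_compress(grad, B, rng=None):
    """Compress a gradient vector into a +/-1 bit vector.

    One common stochastic-sign construction (illustrative sketch): given a
    scale B with B >= ||grad||_inf, coordinate g_i is mapped to +1 with
    probability (1 + g_i / B) / 2 and to -1 otherwise, so B * E[bit_i] = g_i.
    """
    rng = rng or np.random.default_rng()
    p_plus = 0.5 * (1.0 + np.clip(grad / B, -1.0, 1.0))
    bits = np.where(rng.random(grad.shape) < p_plus, 1, -1)
    return bits.astype(np.int8)  # each entry needs only one bit on the wire

# Example: a client compresses its local gradient before sending it.
g = np.array([0.3, -1.2, 0.05, 0.9])
B = np.max(np.abs(g))  # coarse l_inf estimate used as the compression scale
print(stochastic_sign_compress(g, B))
```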
To handle unbounded gradients, Ada-StoSign uses a novel norm tracking
function that adaptively adjusts a coarse estimate of the $\ell_{\infty}$ norm
of the local gradients, a key parameter used in gradient compression. We show
that Ada-StoSign converges in expectation at a rate $O(\log T/\sqrt{T} +
1/\sqrt{M})$, where $T$ is the number of iterations and $M$ is the number of
clients. To the best of our knowledge,
when $M$ is sufficiently large, Ada-StoSign outperforms the state-of-the-art
sign-based method, whose convergence rate is $O(T^{-1/4})$. Under the bounded
gradient assumption, $\beta$-StoSign achieves quantifiable Byzantine resilience
and privacy assurances, and it works with partial client participation and
mini-batch gradients, which may be unbounded. We corroborate and complement
our theory with experiments on the MNIST and CIFAR-10 datasets.
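The abstract does not spell out the norm tracking function used by Ada-StoSign, so the sketch below is only a hypothetical illustration of how a coarse $\ell_{\infty}$ estimate might be adjusted across rounds; the name update_linf_estimate and the grow/shrink factors are assumptions, not part of the paper.

```python
import numpy as np

def update_linf_estimate(B_prev, grad, grow=2.0, shrink=0.9):
    """Hypothetical multiplicative tracking rule (assumption, for
    illustration only): enlarge the coarse estimate when the observed
    gradient exceeds it, and shrink it slowly otherwise, so the scale
    used for compression loosely follows ||grad||_inf over time."""
    g_inf = float(np.max(np.abs(grad)))
    if g_inf > B_prev:
        return grow * g_inf             # react quickly to larger gradients
    return max(shrink * B_prev, g_inf)  # decay slowly, never below g_inf

# Example: track the scale over a few simulated rounds.
B = 1.0
for g in (np.array([0.2, -0.5]), np.array([3.0, -1.0]), np.array([0.1, 0.4])):
    B = update_linf_estimate(B, g)
    print(round(B, 3))
```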