Federated learning (FL) is a trending training paradigm to utilize
decentralized training data. FL allows clients to update model parameters
locally for several epochs, then share them to a global model for aggregation.
This training paradigm with multi-local step updating before aggregation
exposes unique vulnerabilities to adversarial attacks. Adversarial training is
a popular and effective method to improve the robustness of networks against
adversaries. In this work, we formulate a general form of federated adversarial
learning (FAL) that is adapted from adversarial learning in the centralized
setting. On the client side of FL training, FAL has an inner loop to generate
adversarial samples for adversarial training and an outer loop to update local
model parameters. On the server side, FAL aggregates local model updates and
broadcast the aggregated model. We design a global robust training loss and
formulate FAL training as a min-max optimization problem. Unlike the
convergence analysis in classical centralized training that relies on the
gradient direction, it is significantly harder to analyze the convergence in
FAL for three reasons: 1) the complexity of min-max optimization, 2) model not
updating in the gradient direction due to the multi-local updates on the
client-side before aggregation and 3) inter-client heterogeneity. We address
these challenges by using appropriate gradient approximation and coupling
techniques and present the convergence analysis in the over-parameterized
regime. Our main result theoretically shows that the minimum loss under our
algorithm can converge to $\epsilon$ small with chosen learning rate and
communication rounds. It is noteworthy that our analysis is feasible for
non-IID clients.