Institute for Artificial Intelligence, Tsinghua University, State Key Lab of Intelligent Technologies and Systems, Beijing National Research Center for Information Science and Technology, Department of Automation, Tsinghua University
Backpropagation (BP) is widely used for calculating gradients in deep neural
networks (DNNs). Often applied together with stochastic gradient descent (SGD)
or its variants, BP is considered the de facto choice in a variety of machine
learning tasks, including DNN training and adversarial attack/defense. Recently,
Guo et al. introduced a linear variant of BP, named LinBP, to generate more
transferable adversarial examples for black-box attacks. Although LinBP has
been shown to be empirically effective in black-box attacks, theoretical
studies and convergence analyses of the method are lacking. This
paper complements and, to some extent, extends the work of Guo et al. by
providing theoretical analyses of LinBP in learning tasks involving neural
networks, including adversarial attack and model training. We demonstrate that,
somewhat surprisingly, LinBP can lead to faster convergence in these tasks than
BP under the same hyper-parameter settings. We confirm our theoretical
results with extensive experiments.