Neural networks have demonstrated success in various domains, yet their
performance can be significantly degraded by even a small input perturbation.
Consequently, the construction of such perturbations, known as adversarial
attacks, has gained significant attention. Many of these attacks fall within
"white-box" scenarios, where the attacker has full access to the neural network. Existing
attack algorithms, such as the projected gradient descent (PGD), commonly take
the sign function on the raw gradient before updating adversarial inputs,
thereby neglecting gradient magnitude information. In this paper, we present a
theoretical analysis of how such a sign-based update algorithm influences
step-wise attack performance, as well as its caveat. We also explain why
previous attempts at directly using raw gradients failed. Based on this analysis, we
further propose a new raw gradient descent (RGD) algorithm that eliminates the
use of sign. Specifically, we convert the constrained optimization problem into
an unconstrained one by introducing a new hidden variable, a non-clipped
perturbation that can move beyond the constraint. Extensive experiments
demonstrate the effectiveness of the proposed RGD algorithm, which
outperforms PGD and other competitors in various settings without incurring
any additional computational overhead. The code is available at
https://github.com/JunjieYang97/RGD.
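
To make the contrast between the two update rules concrete, the following is a minimal sketch, not the authors' implementation (see the repository above for that). It assumes an L-infinity constraint of radius `eps`, a toy linear classifier with logistic loss so the input gradient is available in closed form, and one plausible reading of the "hidden variable" idea: the unclipped perturbation `u` accumulates raw gradients, and only its clipped copy is added to the input. All function names, the step sizes, and the choice to evaluate the gradient at the clipped point are illustrative assumptions.

```python
import math


def loss_and_grad(w, z, y):
    """Logistic loss of a toy linear classifier and its gradient w.r.t. input z."""
    margin = y * sum(wi * zi for wi, zi in zip(w, z))
    loss = math.log1p(math.exp(-margin))
    coeff = -y / (1.0 + math.exp(margin))  # d(loss) / d(w . z)
    return loss, [coeff * wi for wi in w]


def clip(v, eps):
    """Project a perturbation onto the L-infinity ball of radius eps."""
    return [max(-eps, min(eps, vi)) for vi in v]


def sign(g):
    return (g > 0) - (g < 0)


def pgd_sign_attack(w, x, y, eps, alpha, steps):
    """Sign-based update: the gradient magnitude is discarded at every step."""
    delta = [0.0] * len(x)
    for _ in range(steps):
        _, g = loss_and_grad(w, [xi + di for xi, di in zip(x, delta)], y)
        delta = clip([di + alpha * sign(gi) for di, gi in zip(delta, g)], eps)
    return delta


def raw_gradient_attack(w, x, y, eps, alpha, steps):
    """Raw-gradient update on a hidden, unclipped perturbation u (assumed form)."""
    u = [0.0] * len(x)  # hidden variable: may leave the eps-ball
    for _ in range(steps):
        delta = clip(u, eps)  # only the clipped copy perturbs the input
        _, g = loss_and_grad(w, [xi + di for xi, di in zip(x, delta)], y)
        u = [ui + alpha * gi for ui, gi in zip(u, g)]  # no sign, no clipping
    return clip(u, eps)  # the returned perturbation satisfies the constraint


# Illustrative values only; the two step sizes are on different scales
# because raw gradients are not normalized the way sign(g) is.
w, x, y, eps = [2.0, -1.0, 0.5], [0.5, -0.2, 0.1], 1, 0.1
delta_pgd = pgd_sign_attack(w, x, y, eps, alpha=0.05, steps=5)
delta_raw = raw_gradient_attack(w, x, y, eps, alpha=1.0, steps=5)
```

In this sketch the sign-based variant moves every coordinate by the same amount `alpha` regardless of how large its gradient is, whereas the raw-gradient variant lets `u` travel past the boundary and applies the constraint only when the perturbation is added to the input.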
External datasets
CIFAR-10
CIFAR-100
ImageNet