Abstract
Adversarial Training (AT) has been demonstrated to improve the robustness of
deep neural networks (DNNs) against adversarial attacks. AT is a min-max
optimization procedure wherein adversarial examples are generated to train a
more robust DNN. The inner maximization step of AT increases the losses of
inputs with respect to their actual classes. The outer minimization involves
minimizing the losses on the adversarial examples obtained from the inner
maximization. This work proposes a standard-deviation-inspired (SDI)
regularization term to improve adversarial robustness and generalization. We
argue that the inner maximization in AT is similar to minimizing a modified
standard deviation of the model's output probabilities. Moreover, we suggest
that maximizing this modified standard deviation can complement the outer
minimization of the AT framework. To support our argument, we experimentally
show that the SDI measure can be used to craft adversarial examples.
Additionally, we demonstrate that combining the SDI regularization term with
existing AT variants enhances the robustness of DNNs against stronger attacks,
such as CW and AutoAttack, and improves generalization.
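
For reference, the min-max procedure described above is commonly written in the standard form sketched below. The notation is an assumption made for illustration and does not come from this abstract: f_\theta is the DNN with parameters \theta, \mathcal{L} is the classification loss, \delta is a perturbation bounded by a budget \epsilon in an \ell_p norm, and \mathcal{D} is the data distribution. The SDI regularization term is not defined in the abstract, so it is left out.

```latex
% Standard adversarial training objective (assumed notation, not from the abstract).
% Inner max: find a bounded perturbation that increases the loss on the true class y.
% Outer min: update the parameters to reduce the loss on those adversarial examples.
\min_{\theta} \; \mathbb{E}_{(x, y) \sim \mathcal{D}}
\left[ \max_{\|\delta\|_p \le \epsilon}
\mathcal{L}\bigl(f_\theta(x + \delta),\, y\bigr) \right]
```

In this form, the inner maximization corresponds to crafting the adversarial example x + \delta, and the outer minimization corresponds to training on it.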