Standard-Deviation-Inspired Regularization for Improving Adversarial Robustness

arxiv

Information in the literature database is collected automatically.

Source

https://arxiv.org/abs/2412.19947

PDF

https://arxiv.org/pdf/2412.19947

Paper Information

Authors: Olukorede Fakorede; Modeste Atsague; Jin Tian
Published: 12-28-2024
Affiliation: Department of Computer Science, Iowa State University
Country: United States of America
Conference: Trans. Mach. Learn. Res.

Labels Estimated by AI

Adversarial Example

Adversarial Training

These labels were automatically added by AI and may be inaccurate.

Abstract

Adversarial Training (AT) has been demonstrated to improve the robustness of deep neural networks (DNNs) against adversarial attacks. AT is a min-max optimization procedure in which adversarial examples are generated to train a more robust DNN. The inner maximization step of AT increases the losses of inputs with respect to their actual classes. The outer minimization involves minimizing the losses on the adversarial examples obtained from the inner maximization. This work proposes a standard-deviation-inspired (SDI) regularization term to improve adversarial robustness and generalization. We argue that the inner maximization in AT is similar to minimizing a modified standard deviation of the model's output probabilities. Moreover, we suggest that maximizing this modified standard deviation can complement the outer minimization of the AT framework. To support our argument, we experimentally show that the SDI measure can be used to craft adversarial examples. Additionally, we demonstrate that combining the SDI regularization term with existing AT variants enhances the robustness of DNNs against stronger attacks, such as CW and AutoAttack, and improves generalization.
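
The min-max procedure described in the abstract is commonly written in the form below. This is the standard adversarial-training objective, not a formula taken from this entry. The symbols are assumptions introduced for illustration: a model f_θ with parameters θ, a classification loss L, a perturbation δ bounded by a budget ε in some ℓ_p norm, and a data distribution D. The SDI regularization term is not reproduced here because the abstract does not define it.

```latex
% Standard adversarial-training objective (illustrative notation, assumed)
\min_{\theta} \; \mathbb{E}_{(x, y) \sim \mathcal{D}}
\left[
  \max_{\|\delta\|_p \le \epsilon}
  \mathcal{L}\bigl(f_\theta(x + \delta),\, y\bigr)
\right]
```

Under this notation, the inner maximization searches for a perturbation δ that increases the loss on the true class y, and the outer minimization updates θ on the resulting adversarial examples x + δ.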

External Datasets

CIFAR-10

CIFAR-100

SVHN

Tiny ImageNet