In recent years, neural networks have demonstrated outstanding effectiveness
in a wide range of applications. However, recent works have shown that neural
networks are susceptible to adversarial examples, indicating possible flaws
intrinsic to network structures. To address this problem and improve the
robustness of neural networks, we investigate the fundamental mechanisms behind
adversarial examples and propose a novel robust training method that regulates
adversarial gradients. This regulation effectively squeezes the adversarial
gradients of neural networks and significantly increases the difficulty of
adversarial example generation. Without involving any adversarial examples,
the robust training method produces naturally robust networks, which are
near-immune to various types of adversarial examples.
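The abstract does not spell out the regulation itself; as a minimal sketch of one common way to squeeze adversarial gradients, the training loss can penalize the norm of the loss gradient with respect to the input (PyTorch here; the penalty weight `lam` is a hypothetical hyperparameter, not taken from the paper):

```python
import torch
import torch.nn.functional as F

def gradient_regulated_loss(model, x, y, lam=0.1):
    """Cross-entropy plus a penalty on the input-gradient norm
    (an illustrative sketch, not necessarily the paper's exact scheme)."""
    x = x.clone().detach().requires_grad_(True)
    logits = model(x)
    ce = F.cross_entropy(logits, y)
    # create_graph=True keeps the penalty differentiable (double backprop),
    # so training can shrink the adversarial gradient itself.
    (grad_x,) = torch.autograd.grad(ce, x, create_graph=True)
    penalty = grad_x.pow(2).flatten(1).sum(dim=1).mean()
    return ce + lam * penalty
```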
Experiments show that the naturally robust networks achieve state-of-the-art
accuracy against Fast Gradient Sign Method (FGSM) and C\&W attacks on MNIST,
CIFAR-10, and the Google Speech Commands dataset. Moreover, the proposed
method provides neural networks
with consistent robustness against transferable attacks.
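For context, FGSM (one of the evaluated attacks) perturbs each input one step along the sign of the input gradient of the loss; a minimal sketch, assuming PyTorch, inputs scaled to [0, 1], and a hypothetical budget `eps`:

```python
import torch
import torch.nn.functional as F

def fgsm_attack(model, x, y, eps=0.03):
    """One-step FGSM: x' = clip(x + eps * sign(dL/dx))."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    loss.backward()
    # A model with squeezed adversarial gradients yields a weaker sign direction.
    return (x + eps * x.grad.sign()).clamp(0.0, 1.0).detach()
```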