Deep neural networks (DNNs) are known to be vulnerable to adversarial attacks: adversarial examples, obtained by adding carefully crafted distortions to original legitimate inputs, can mislead a DNN into classifying them as any target label. A successful adversarial attack should achieve the targeted misclassification with the least distortion added. In the literature, the added distortions are usually measured by the L0, L1, L2, and L-infinity norms, giving rise to L0, L1, L2, and L-infinity attacks, respectively. However, a versatile framework that covers all of these attack types is still missing.
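As a quick illustration of these distortion measures (not taken from the paper; the function below and its use of NumPy are purely illustrative), the four norms of a perturbation can be computed as follows:

```python
# Illustrative only: measuring a distortion delta under the four norms
# discussed above, using NumPy.
import numpy as np

def distortion_norms(delta: np.ndarray) -> dict:
    """Return the L0, L1, L2, and L-infinity norms of a flattened distortion."""
    d = delta.ravel()
    return {
        "L0": int(np.count_nonzero(d)),        # number of perturbed elements
        "L1": float(np.abs(d).sum()),          # total absolute change
        "L2": float(np.sqrt((d ** 2).sum())),  # Euclidean magnitude
        "Linf": float(np.abs(d).max()),        # largest single-element change
    }

# Example: a tiny 2x2 "image" perturbation
print(distortion_norms(np.array([[0.0, 0.3], [-0.4, 0.0]])))
```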
This work, for the first time, unifies the generation of adversarial examples by leveraging ADMM (Alternating Direction Method of Multipliers), an operator-splitting optimization approach, so that L0, L1, L2, and L-infinity attacks can all be implemented effectively within one general framework with only minor modifications. Compared with the state-of-the-art attack in each category, our ADMM-based attacks are the strongest to date, achieving both a 100% attack success rate and the lowest distortion.
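The abstract does not spell out the exact formulation, so the following is only a minimal conceptual sketch of ADMM operator splitting applied to an attack, under stated assumptions: the distortion is split into two copies, one updated against a classifier-side loss (here a toy linear model, not a DNN) and one handled by the proximal operator of the chosen Lp norm (L2 in this sketch), with a scaled dual variable tying the copies together. All names and parameters (adv_loss_grad, prox_l2, rho, lam, the step sizes) are hypothetical.

```python
# Hedged sketch, not the paper's method: generic ADMM splitting for an
# L2-style attack on a toy linear "classifier".
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(3, 4))          # toy linear classifier: 3 classes, 4 features
x0 = rng.normal(size=4)              # original input
target = int(np.argmin(W @ x0))      # attack toward the least-likely class

def adv_loss_grad(delta):
    """Subgradient of a margin loss pushing W @ (x0 + delta) toward `target`."""
    logits = W @ (x0 + delta)
    other = np.argmax(np.delete(logits, target))
    other = other + (other >= target)        # map back into the full logit index
    if logits[other] - logits[target] > 0:   # loss = max(logits[other] - logits[target], 0)
        return W[other] - W[target]
    return np.zeros_like(delta)

def prox_l2(v, lam):
    """Proximal operator of lam * ||.||_2 (block soft-thresholding)."""
    nrm = np.linalg.norm(v)
    return np.zeros_like(v) if nrm <= lam else (1 - lam / nrm) * v

rho, lam, lr = 1.0, 0.05, 0.1
delta = np.zeros(4); z = np.zeros(4); u = np.zeros(4)
for _ in range(200):
    # delta-update: gradient steps on adv_loss + (rho/2)||delta - z + u||^2
    for _ in range(5):
        delta -= lr * (adv_loss_grad(delta) + rho * (delta - z + u))
    z = prox_l2(delta + u, lam / rho)        # z-update: proximal step on the L2 norm
    u += delta - z                           # dual update enforcing delta == z

print("predicted label:", np.argmax(W @ (x0 + z)), "target:", target,
      "L2 distortion:", float(np.linalg.norm(z)))
```

If the sketch is roughly faithful to the idea, swapping prox_l2 for soft-thresholding (L1), hard-thresholding (L0), or a box projection (L-infinity) is what would let a single splitting scheme cover all four attack types with little modification.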