Abstract
Past work exploring adversarial vulnerability has focused on situations
where an adversary can perturb all dimensions of a model's input. In
contrast, a range of recent works considers the case where an adversary
can perturb either (i) a limited number of input parameters or (ii) a subset
of modalities in a multimodal problem. In both of these cases, adversarial examples are
effectively constrained to a subspace $V$ in the ambient input space
$\mathcal{X}$. Motivated by this, in this work we investigate how adversarial
vulnerability depends on $\dim(V)$. In particular, we show that the adversarial
success of standard PGD attacks with $\ell^p$ norm constraints behaves like a
monotonically increasing function of $\epsilon \left( \frac{\dim(V)}{\dim(\mathcal{X})} \right)^{\frac{1}{q}}$,
where $\epsilon$ is the perturbation budget and
$\frac{1}{p} + \frac{1}{q} = 1$, provided $p > 1$ (the case $p = 1$ presents
additional subtleties which we analyze in some detail). This functional form
can be easily derived from a simple toy linear model, and as such our results
lend further credence to arguments that adversarial examples are endemic to
locally linear models on high-dimensional spaces.
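
As a sketch of the kind of derivation the abstract alludes to (our own notation and assumptions; the paper's precise setup may differ), consider a linear model $f(x) = w^\top x$ attacked by a perturbation $\delta$ confined to a coordinate subspace $V \subseteq \mathcal{X}$ with budget $\|\delta\|_p \le \epsilon$. Hölder duality gives the maximal achievable score shift
$$\max_{\delta \in V, \; \|\delta\|_p \le \epsilon} w^\top \delta \;=\; \epsilon \, \|w_V\|_q, \qquad \frac{1}{p} + \frac{1}{q} = 1,$$
where $w_V$ is the restriction of $w$ to the coordinates spanning $V$. If the entries of $w$ behave like i.i.d. samples with finite $q$-th moment, then $\|w_V\|_q \approx (\dim(V))^{\frac{1}{q}} \, \mathbb{E}[|w_i|^q]^{\frac{1}{q}}$, while the full-space shift scales as $\epsilon \, \|w\|_q \propto \epsilon \, (\dim(\mathcal{X}))^{\frac{1}{q}}$; the ratio reproduces the $\epsilon \left( \frac{\dim(V)}{\dim(\mathcal{X})} \right)^{\frac{1}{q}}$ scaling above. For $p = 1$ (so $q = \infty$), $\|w_V\|_\infty$ is governed by the single largest coordinate rather than an accumulating sum, consistent with the additional subtleties noted for that case.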
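
For concreteness, the following is a minimal, hypothetical NumPy sketch (our own toy setup, not the authors' code) of an $\ell^\infty$-constrained PGD attack whose perturbation is masked to a coordinate subspace $V$ of a toy linear model; since $p = \infty$ gives $q = 1$, the achievable score shift should grow roughly linearly in $\dim(V)$:

```python
import numpy as np

rng = np.random.default_rng(0)

def pgd_linf_subspace(x, w, mask, eps, step, n_steps=40):
    """l_inf PGD on a linear score f(x) = w @ x, with the perturbation
    restricted to a subspace V via a 0/1 coordinate mask.
    Illustrative toy only, standing in for the paper's experiments."""
    delta = np.zeros_like(x)
    for _ in range(n_steps):
        grad = w                              # d/d(delta) of w @ (x + delta)
        delta += step * np.sign(grad) * mask  # ascent step, confined to V
        delta = np.clip(delta, -eps, eps)     # project onto the l_inf ball
    return x + delta

d = 1024                       # dim(X) (hypothetical)
w = rng.standard_normal(d)
x = rng.standard_normal(d)
eps = 0.1

for k in (16, 64, 256, 1024):  # dim(V)
    mask = np.zeros(d)
    mask[rng.choice(d, size=k, replace=False)] = 1.0
    shift = w @ (pgd_linf_subspace(x, w, mask, eps, step=eps / 10) - x)
    # With q = 1 the shift should track eps * k * E|w_i| = eps * k * sqrt(2/pi).
    print(f"dim(V)={k:5d}  shift={shift:8.3f}  predicted={eps * k * np.sqrt(2 / np.pi):8.3f}")
```

The monotone growth of the achievable shift with $\dim(V)$ is the qualitative behavior the abstract describes for adversarial success.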