Abstract
Black-box adversarial attacks have demonstrated strong potential to
compromise machine learning models by iteratively querying the target model or
leveraging transferability from a local surrogate model. However, such attacks
can now be effectively mitigated by state-of-the-art (SOTA) defenses, e.g.,
detection based on the pattern of sequential queries, or noise injection into
the model. To the best of our knowledge, we take the first step toward
studying a new paradigm of
black-box attacks with provable guarantees -- certifiable black-box attacks
that can guarantee the attack success probability (ASP) of adversarial examples
before querying the target model. Compared to traditional empirical black-box
attacks, this new attack paradigm unveils more significant vulnerabilities of
machine learning models: it breaks strong SOTA defenses with provable
confidence, constructs a space of (infinitely many) adversarial examples with
high ASP, and theoretically guarantees the ASP of the generated adversarial
examples without verification queries to the target model. Specifically, we
establish a novel theoretical foundation for ensuring the ASP of the black-box
attack with randomized adversarial examples (AEs). Then, we propose several
novel techniques to craft the randomized AEs while reducing the perturbation
size for better imperceptibility. Finally, we comprehensively evaluate the
certifiable black-box attacks on the CIFAR10/100, ImageNet, and LibriSpeech
datasets, benchmarking them against 16 SOTA black-box attacks and various SOTA
defenses in the domains of computer vision and speech recognition. Both
theoretical and experimental results validate the significance of the proposed
attack. The code and all benchmarks are available at
\url{https://github.com/datasec-lab/CertifiedAttack}.
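To make the certification idea concrete, the following is a minimal,
hypothetical Python sketch of how one might lower-bound the ASP of a
randomized adversarial example via Monte Carlo sampling. This is not the
paper's actual algorithm: the surrogate interface (surrogate_predict), the
Gaussian noise scale sigma, and the use of a Clopper-Pearson bound are all
illustrative assumptions, and the sketch bounds success against a local
surrogate rather than reproducing the paper's query-free guarantee on the
target model.

# Hypothetical sketch: a high-confidence lower bound on the attack success
# probability (ASP) of a randomized adversarial example, estimated by
# sampling Gaussian noise around x_adv and counting misclassifications on a
# local surrogate model. Illustrative only; not the paper's algorithm.
import numpy as np
from scipy.stats import beta

def asp_lower_bound(surrogate_predict, x_adv, true_label,
                    sigma=0.25, n_samples=1000, alpha=0.001):
    """One-sided (1 - alpha) Clopper-Pearson lower bound on
    P[surrogate_predict(x_adv + noise) != true_label] over Gaussian noise."""
    successes = 0
    for _ in range(n_samples):
        noise = np.random.normal(0.0, sigma, size=x_adv.shape)
        if surrogate_predict(x_adv + noise) != true_label:
            successes += 1
    if successes == 0:
        return 0.0  # no observed successes; only the trivial bound holds
    # Clopper-Pearson: the lower bound is the alpha-quantile of
    # Beta(successes, n_samples - successes + 1).
    return beta.ppf(alpha, successes, n_samples - successes + 1)

Under these assumptions, an attacker could retain only perturbations whose
certified lower bound exceeds a desired ASP threshold (e.g., 0.9); with
alpha = 0.001, the bound holds with 99.9% confidence over the sampling,
analogous in spirit to the confidence bounds used in randomized smoothing.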