Cross-Input Certified Training for Universal Perturbations

International conference on machine learning

Synthesizing robust adversarial examples

Athalye, A., Engstrom, L., Ilyas, A., Kwok, K.

Published: 2018

8th International Conference on Learning Representations (ICLR 2020), virtual

Adversarial training and provable defenses: Bridging the gap

Balunović, M., Vechev, M.

Published: 2020

Relational DNN verification with cross executional bound refinement

Banerjee, D., Singh, G.

Published: 2024

Proc. ACM Program. Lang. 8(PLDI)

Input-relational verification of deep neural networks

Banerjee, D., Xu, C., Singh, G.

Published: 2024

2021 IEEE International Conference on Multimedia and Expo (ICME)

Universal adversarial training with class-wise perturbations

Benz, P., Zhang, C., Karjauv, A., Kweon, I.S.

Published: 2021

The Eleventh International Conference on Learning Representations, ICLR

(certified!!) adversarial robustness for free!

Carlini, N., Tramer, F., Dvijotham, K. D., Rice, L., Sun, M., Kolter, J. Z.

Published: 2023

IEEE Symposium on Security and Privacy

Towards Evaluating the Robustness of Neural Networks

Nicholas Carlini, David Wagner

Published: 8.17.2016

Neural networks provide state-of-the-art results for most machine learning tasks. Unfortunately, neural networks are vulnerable to adversarial examples: given an input $x$ and any target classification $t$, it is possible to find a new input $x'$ that is similar to $x$ but classified as $t$. This makes it difficult to apply neural networks in security-critical areas. Defensive distillation is a recently proposed approach that can take an arbitrary neural network, and increase its robustness, reducing the success rate of current attacks' ability to find adversarial examples from $95\%$ to $0.5\%$. In this paper, we demonstrate that defensive distillation does not significantly increase the robustness of neural networks by introducing three new attack algorithms that are successful on both distilled and undistilled neural networks with $100\%$ probability. Our attacks are tailored to three distance metrics used previously in the literature, and when compared to previous adversarial example generation algorithms, our attacks are often much more effective (and never worse). Furthermore, we propose using high-confidence adversarial examples in a simple transferability test we show can also be used to break defensive distillation. We hope our attacks will be used as a benchmark in future defense attempts to create neural networks that resist adversarial examples.

ICML

Certified adversarial robustness via randomized smoothing

J. Cohen, E. Rosenfeld, Z. Kolter

Published: 2019

Neural Information Processing Systems

Enabling certification of verification-agnostic networks via memory-efficient semidefinite programming

Sumanth Dathathri, Krishnamurthy (Dj) Dvijotham, Alex Kurakin, Aditi Raghunathan, Jonathan Uesato, Rudy Bunel, Shreya Shankar, Jacob Steinhardt, Ian Goodfellow, Percy Liang, Pushmeet Kohli

Published: 2020

International Conference on Learning Representations

Scaling the convex barrier with active sets

Alessandro De Palma, Harkirat Singh Behl, Rudy Bunel, Philip H. S. Torr, M. Pawan Kumar

Published: 2021