Abstract
It is well known that standard neural networks, even those with high classification accuracy, are vulnerable to small ℓ∞-norm bounded adversarial perturbations. Although many defenses have been proposed, most previous works can either only verify robustness empirically against a particular attack method, or only establish certified guarantees of model robustness in limited scenarios. In this paper, we seek a new approach to developing a theoretically principled neural network that inherently resists ℓ∞ perturbations. In particular, we design a novel neuron that uses the ℓ∞ distance as its basic operation (which we call an ℓ∞-dist neuron), and show that any neural network constructed with ℓ∞-dist neurons (called an ℓ∞-dist net) is naturally a 1-Lipschitz function with respect to the ℓ∞ norm. This directly provides a rigorous certified-robustness guarantee based on the margin of the prediction outputs. We then prove that such networks have enough expressive power to approximate any 1-Lipschitz function, with a robust generalization guarantee. We further provide a holistic training strategy that greatly alleviates the optimization difficulties. Experimental results show that using ℓ∞-dist nets as basic building blocks, we consistently achieve state-of-the-art performance on commonly used datasets: 93.09% certified accuracy on MNIST (ϵ = 0.3), 35.42% on CIFAR-10 (ϵ = 8/255), and 16.31% on TinyImageNet (ϵ = 1/255).
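To make the construction concrete, here is a minimal PyTorch sketch of an ℓ∞-dist neuron layer together with the margin-based certificate described above. It is an illustration reconstructed from the abstract alone, not the authors' released implementation; the names `LinfDistNeuronLayer` and `certified_radius`, the Gaussian initialization, and the layer sizes in the usage example are assumptions.

```python
import torch
import torch.nn as nn

class LinfDistNeuronLayer(nn.Module):
    """Layer of l_inf-dist neurons (sketch, reconstructed from the abstract).

    Each neuron computes u_j(x) = ||x - w_j||_inf + b_j: the l_inf distance
    between the input and the neuron's weight vector, plus a bias. Since
    | ||x - w||_inf - ||x' - w||_inf | <= ||x - x'||_inf, every output
    coordinate is 1-Lipschitz w.r.t. the l_inf norm, so the layer is a
    1-Lipschitz map from l_inf to l_inf, and stacking layers preserves this.
    """

    def __init__(self, in_features: int, out_features: int):
        super().__init__()
        # Gaussian initialization is an assumption, not the paper's scheme.
        self.weight = nn.Parameter(torch.randn(out_features, in_features))
        self.bias = nn.Parameter(torch.zeros(out_features))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, in_features) -> (batch, out_features)
        diff = x.unsqueeze(1) - self.weight.unsqueeze(0)  # (batch, out, in)
        return diff.abs().amax(dim=-1) + self.bias

@torch.no_grad()
def certified_radius(logits: torch.Tensor) -> torch.Tensor:
    """Margin-based certificate for a 1-Lipschitz (l_inf) network.

    Any perturbation with ||delta||_inf <= (top1 - top2) / 2 can lower the
    top logit and raise the runner-up by at most that amount each, so the
    predicted class cannot change within that radius.
    """
    top2 = logits.topk(2, dim=-1).values
    return (top2[..., 0] - top2[..., 1]) / 2

# Usage: a two-layer l_inf-dist net and its per-example certified radii.
net = nn.Sequential(LinfDistNeuronLayer(784, 128), LinfDistNeuronLayer(128, 10))
x = torch.rand(4, 784)
radii = certified_radius(net(x))  # predictions provably stable within these l_inf balls
```

This mirrors the guarantee stated in the abstract: if the prediction margin on an input exceeds 2ϵ, the classification is certified robust against every ℓ∞ perturbation of norm at most ϵ.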