Labels estimated by AI
※ These labels were added automatically by AI and may therefore be inaccurate.
For details, see "About the literature database".
Abstract
It is well-known that standard neural networks, even with a high classification accuracy, are vulnerable to small ℓ∞-norm bounded adversarial perturbations. Although many attempts have been made, most previous works either can only provide empirical verification of the defense against a particular attack method, or can only develop a certified guarantee of model robustness in limited scenarios. In this paper, we seek a new approach to develop a theoretically principled neural network that inherently resists ℓ∞ perturbations. In particular, we design a novel neuron that uses the ℓ∞-distance as its basic operation (which we call the ℓ∞-dist neuron), and show that any neural network constructed with ℓ∞-dist neurons (called an ℓ∞-dist net) is naturally a 1-Lipschitz function with respect to the ℓ∞-norm. This directly provides a rigorous guarantee of certified robustness based on the margin of the prediction outputs. We then prove that such networks have enough expressive power to approximate any 1-Lipschitz function with a robust generalization guarantee. We further provide a holistic training strategy that can greatly alleviate optimization difficulties. Experimental results show that using ℓ∞-dist nets as basic building blocks, we consistently achieve state-of-the-art performance on commonly used datasets: 93.09% certified accuracy on MNIST (ϵ = 0.3), 35.42% on CIFAR-10 (ϵ = 8/255), and 16.31% on TinyImageNet (ϵ = 1/255).
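The mechanism described in the abstract can be sketched in a few lines of NumPy. The sketch below is an illustrative assumption, not the paper's exact architecture: each unit outputs the ℓ∞-distance between the input and its weight vector, which makes the layer 1-Lipschitz with respect to the ℓ∞-norm (by the reverse triangle inequality), and a prediction is then certifiably robust whenever the output margin exceeds 2ϵ, since each output coordinate can move by at most ϵ under an ϵ-bounded perturbation. The layer shape, bias handling, and function names are hypothetical.

```python
import numpy as np

def linf_dist_layer(x, W, b):
    """One hypothetical l_inf-dist layer: unit j outputs ||x - w_j||_inf + b_j."""
    return np.max(np.abs(x[None, :] - W), axis=1) + b

def certified_by_margin(outputs, eps):
    """Margin-based certificate for a 1-Lipschitz (w.r.t. l_inf) network:
    if the gap between the top two outputs exceeds 2*eps, no perturbation
    with ||delta||_inf <= eps can change the predicted class."""
    top_two = np.sort(outputs)[-2:]
    return (top_two[1] - top_two[0]) > 2 * eps

# Empirical check of the 1-Lipschitz property on random data.
rng = np.random.default_rng(0)
W = rng.normal(size=(4, 8))
b = rng.normal(size=4)
x = rng.normal(size=8)
x_pert = x + rng.uniform(-0.1, 0.1, size=8)   # perturbed input

change = np.abs(linf_dist_layer(x, W, b) - linf_dist_layer(x_pert, W, b))
# Each output moves by no more than the l_inf size of the input perturbation.
assert np.all(change <= np.max(np.abs(x - x_pert)) + 1e-12)
```

Stacking such layers preserves the 1-Lipschitz property by composition, which is why the margin check above applies to the whole network's outputs rather than to any single layer.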