Despite the great success achieved in machine learning (ML), adversarial
examples have raised concerns about its trustworthiness: a small perturbation
of an input can cause an otherwise seemingly well-trained ML model to fail
arbitrarily. While studies are being conducted to discover
the intrinsic properties of adversarial examples, such as their transferability
and universality, there is insufficient theoretical analysis to help understand
the phenomenon in a way that can influence the design process of ML
experiments. In this paper, we derive an information-theoretic model that
explains adversarial attacks as the abuse of feature redundancies in ML
algorithms. We prove that feature redundancy is a necessary condition for the
existence of adversarial examples. Our model helps answer some of the major
questions raised in many anecdotal studies of adversarial examples. Our theory
is backed up by empirical measurements of the information content of benign and
adversarial examples on both image and text datasets. Our measurements show
that typical adversarial examples introduce just enough redundancy to overwhelm
the decision making of an ML model trained on the corresponding benign examples. We
conclude with actionable recommendations to improve the robustness of machine
learners against adversarial examples.
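As a minimal sketch of what a compression-based comparison of information content could look like (the abstract does not specify the paper's actual estimator, so the gzip proxy and the helper names compressed_size and redundancy_gap below are illustrative assumptions, not the authors' method):

```python
import gzip

import numpy as np


def compressed_size(x: np.ndarray) -> int:
    """gzip-compressed byte length of an array: a crude proxy for its information content."""
    return len(gzip.compress(x.astype(np.uint8).tobytes()))


def redundancy_gap(benign: np.ndarray, adversarial: np.ndarray) -> int:
    """Difference in compressed size between an adversarial example and its benign counterpart.

    A positive gap suggests the perturbation added structure that the
    compressor cannot explain away, i.e. extra redundancy.
    """
    return compressed_size(adversarial) - compressed_size(benign)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Stand-ins for a benign image and a small adversarial perturbation.
    benign = rng.integers(0, 256, size=(32, 32, 3), dtype=np.uint8)
    perturbation = rng.integers(-2, 3, size=benign.shape)
    adversarial = np.clip(benign.astype(int) + perturbation, 0, 255).astype(np.uint8)

    print("benign bytes:     ", compressed_size(benign))
    print("adversarial bytes:", compressed_size(adversarial))
    print("redundancy gap:   ", redundancy_gap(benign, adversarial))
```

On real data one would average such gaps over a dataset and a fixed attack; the single random-array comparison above only demonstrates the mechanics of the measurement.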