Abstract
Machine learning models have been shown to be vulnerable to adversarial examples, i.e., data manipulated by an attacker to defeat a defender's classifier at test time. We present a novel probabilistic definition of adversarial examples in the perfect- and limited-knowledge settings, using prior probability distributions on the defender's classifier. Using the asymptotic properties of logistic regression, we derive a closed-form expression for the intensity of any adversarial perturbation required to achieve a given expected misclassification rate. This technique is relevant in a threat model where the model specification is known but the training data are not. To our knowledge, this is the first method that allows an attacker to directly choose the probability of attack success. We evaluate our approach on two real-world datasets.
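To make the idea behind the abstract concrete, the following is a minimal sketch, not the paper's actual derivation: it assumes the attacker's uncertainty about a logistic-regression defender can be summarized by a Gaussian prior on its parameters (here taken as the asymptotic MLE distribution with covariance estimated from the inverse Fisher information on a surrogate dataset), so that the expected misclassification probability under a perturbation of given intensity has a Gaussian-CDF form that can be inverted for the intensity. All names (`expected_flip_prob`, the surrogate data, the bisection via `brentq`) are illustrative assumptions, and the paper's closed-form expression is replaced by a numerical solve.

```python
# Hypothetical illustration: choose the perturbation intensity that achieves a
# target expected misclassification probability under a Gaussian prior on the
# defender's logistic-regression parameters.
import numpy as np
from scipy.optimize import brentq
from scipy.stats import norm
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Surrogate data and a surrogate model standing in for the defender.
X, y = make_classification(n_samples=2000, n_features=10, random_state=0)
clf = LogisticRegression(C=1e4, max_iter=1000).fit(X, y)
w_hat, b_hat = clf.coef_.ravel(), clf.intercept_[0]

# Asymptotic covariance of the MLE: inverse of the observed Fisher information.
Xb = np.hstack([X, np.ones((len(X), 1))])           # add intercept column
p = clf.predict_proba(X)[:, 1]
fisher = (Xb * (p * (1 - p))[:, None]).T @ Xb
cov = np.linalg.inv(fisher)                          # covariance of (w, b)

def expected_flip_prob(x, direction, eps):
    """Probability that x + eps * direction lands on the wrong side of the
    boundary, averaged over the Gaussian prior on the defender's parameters."""
    x_adv = x + eps * direction
    z = np.append(x_adv, 1.0)                        # (x + delta, 1)
    mean = w_hat @ x_adv + b_hat                     # mean logit under the prior
    std = np.sqrt(z @ cov @ z)                       # logit std under the prior
    # A point originally classified positive is fooled when the logit < 0.
    return norm.cdf(-mean / std)

# Attack a positively classified point, pushing against the mean weight vector.
x0 = X[clf.decision_function(X) > 0][0]
direction = -w_hat / np.linalg.norm(w_hat)
target = 0.9                                         # desired expected success rate

eps_star = brentq(lambda e: expected_flip_prob(x0, direction, e) - target, 0.0, 100.0)
print(f"intensity for {target:.0%} expected success: {eps_star:.3f}")
```

Because the logit of the perturbed point is Gaussian under the assumed prior, its misclassification probability is monotone in the perturbation intensity, which is what makes directly targeting a chosen success probability possible in this setting.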