Adversarial Vulnerability Bounds for Gaussian Process Classification

Authors: Michael Thomas Smith, Kathrin Grosse, Michael Backes, Mauricio A Alvarez | Published: 2019-09-19

2019.09.192025.05.28

Authors: Michael Thomas Smith, Kathrin Grosse, Michael Backes, Mauricio A Alvarez
Published: 2019-09-19

Source: https://arxiv.org/abs/1909.08864

PDF: https://arxiv.org/pdf/1909.08864

Labels Predicted by AI

Adversarial Example Taxonomy of Attacks Machine Learning Technology

Please note that these labels were automatically added by AI. Therefore, they may not be entirely accurate.
For more details, please see the About the Literature Database page.

Abstract

Machine learning (ML) classification is increasingly used in safety-critical systems. Protecting ML classifiers from adversarial examples is crucial. We propose that the main threat is that of an attacker perturbing a confidently classified input to produce a confident misclassification. To protect against this we devise an adversarial bound (AB) for a Gaussian process classifier, that holds for the entire input domain, bounding the potential for any future adversarial method to cause such misclassification. This is a formal guarantee of robustness, not just an empirically derived result. We investigate how to configure the classifier to maximise the bound, including the use of a sparse approximation, leading to the method producing a practical, useful and provably robust classifier, which we test using a variety of datasets.