With False Friends Like These, Who Can Notice Mistakes?

Source

https://arxiv.org/abs/2012.14738

PDF

https://arxiv.org/pdf/2012.14738

Paper Information

Authors: Lue Tao; Lei Feng; Jinfeng Yi; Songcan Chen
Published: 2020-12-29
Updated: 2021-12-14
Affiliation: College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics
Country: China
Conference: AAAI Conference on Artificial Intelligence (AAAI)

Labels Estimated by AI

Adversarial Example, Adversarial Learning, Defense Mechanism

These labels were automatically added by AI and may be inaccurate.

Abstract

Adversarial examples crafted by an explicit adversary have attracted significant attention in machine learning. However, the security risk posed by a potential false friend has been largely overlooked. In this paper, we unveil the threat of hypocritical examples -- inputs that are originally misclassified yet perturbed by a false friend to force correct predictions. While such perturbed examples seem harmless, we point out for the first time that they could be maliciously used to conceal the mistakes of a substandard (i.e., not as good as required) model during an evaluation. Once a deployer trusts the hypocritical performance and applies the "well-performed" model in real-world applications, unexpected failures may happen even in benign environments. More seriously, this security risk seems to be pervasive: we find that many types of substandard models are vulnerable to hypocritical examples across multiple datasets. Furthermore, we provide the first attempt to characterize the threat with a metric called hypocritical risk and try to circumvent it via several countermeasures. Results demonstrate the effectiveness of the countermeasures, while the risk remains non-negligible even after adaptive robust training.
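
To make the idea of a hypocritical example concrete, here is a minimal sketch under stated assumptions; it is not the authors' implementation. It assumes a toy linear softmax classifier, an L-infinity perturbation budget `eps`, and a signed-gradient procedure with projection that lowers the loss on the true label (the mirror image of a standard adversarial attack, which raises it). The helper `flipped_fraction` is a hypothetical illustration of how many misclassified inputs can be pushed to the correct prediction within the budget; it is not claimed to match the paper's formal definition of hypocritical risk.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over a 1-D logit vector.
    z = z - np.max(z)
    e = np.exp(z)
    return e / e.sum()

def predict(x, W, b):
    # Predicted class of the toy linear model: argmax of x @ W + b.
    return int(np.argmax(x @ W + b))

def hypocritical_perturb(x, y, W, b, eps=0.5, step=0.1, iters=20):
    # Assumed procedure: signed-gradient DESCENT on the cross-entropy
    # of the true label y, projected back into the L-inf ball around x.
    x_h = x.copy()
    for _ in range(iters):
        p = softmax(x_h @ W + b)
        p[y] -= 1.0                      # dLoss/dlogits = p - onehot(y)
        grad = W @ p                     # dLoss/dx for a linear model
        x_h = x_h - step * np.sign(grad) # move toward lower loss
        x_h = np.clip(x_h, x - eps, x + eps)
    return x_h

def flipped_fraction(X, Y, W, b, eps=0.5):
    # Hypothetical summary: among misclassified inputs, the fraction
    # that becomes correctly classified after the perturbation.
    flipped, wrong = 0, 0
    for x, y in zip(X, Y):
        if predict(x, W, b) == y:
            continue
        wrong += 1
        x_h = hypocritical_perturb(x, y, W, b, eps=eps)
        flipped += int(predict(x_h, W, b) == y)
    return flipped / wrong if wrong else 0.0

# Toy model: class 0 is predicted when the first feature is positive.
W = np.array([[1.0, -1.0], [0.0, 0.0]])
b = np.zeros(2)
# Three inputs with true label 0: one near the boundary (wrong),
# one far from it (wrong), and one already correct.
X = np.array([[-0.2, 0.3], [-2.0, 0.0], [1.0, 0.0]])
Y = np.array([0, 0, 0])
print(flipped_fraction(X, Y, W, b))
```

On this toy data the script prints 0.5: the misclassified input near the decision boundary is pushed to the correct side within the budget, while the one far from the boundary is not. An evaluation run on the perturbed inputs would therefore report higher accuracy than the model achieves on clean data, which is the concealment effect the abstract describes.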

External Datasets

CIFAR-10

SVHN

CIFAR-100

Tiny-ImageNet