Abstract
With the growing popularity of artificial intelligence and machine learning,
a wide spectrum of attacks against deep learning models has been proposed in
the literature. Both evasion attacks and poisoning attacks attempt to use
adversarially altered samples to fool the victim model into misclassifying
the adversarial sample. While such attacks claim to be, or are expected to be,
stealthy, i.e., imperceptible to human eyes, such claims are rarely evaluated.
In this paper, we present the first large-scale study on the stealthiness of
adversarial samples used in the attacks against deep learning. We have
implemented 20 representative adversarial ML attacks on six popular
benchmarking datasets. We evaluate the stealthiness of the attack samples using
two complementary approaches: (1) a numerical study that adopts 24 metrics for
image similarity or quality assessment; and (2) a user study with three sets of
questionnaires that collected 20,000+ annotations from 1,000+ responses.
Our results show that the majority of the existing attacks introduce
non-negligible perturbations that are not stealthy to human eyes. We further
analyze the factors that contribute to attack stealthiness. We also examine
the correlation between the numerical analysis and the user studies, and
demonstrate that some image quality metrics may provide useful guidance for
attack design, although a significant gap remains between assessed image
quality and visual stealthiness of attacks.
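
As a rough illustration of the kind of numerical analysis described above, the following Python sketch computes a few common perturbation measures (L2 and L-infinity distance, MSE, PSNR) between a clean image and an adversarially altered one, plus a simple rank correlation that could relate a metric to human ratings. It is not the paper's implementation: the abstract does not list the 24 metrics, so the metrics chosen here, the function names `perturbation_metrics` and `spearman`, the [0, 1] pixel range, and the random test image are all assumptions made for illustration.

```python
import numpy as np


def perturbation_metrics(clean, adv, max_val=1.0):
    """Return simple distance/quality measures between two same-shaped images.

    Assumes pixel values lie in [0, max_val]. These measures are illustrative
    stand-ins, not the metric set used in the paper.
    """
    clean = np.asarray(clean, dtype=np.float64)
    adv = np.asarray(adv, dtype=np.float64)
    delta = adv - clean
    mse = float(np.mean(delta ** 2))
    # PSNR is undefined for identical images; report infinity in that case.
    psnr = float("inf") if mse == 0 else float(10.0 * np.log10(max_val ** 2 / mse))
    return {
        "l2": float(np.linalg.norm(delta.ravel())),
        "linf": float(np.max(np.abs(delta))),
        "mse": mse,
        "psnr": psnr,
    }


def spearman(x, y):
    """Spearman rank correlation for sequences without tied values."""
    rank_x = np.argsort(np.argsort(x))
    rank_y = np.argsort(np.argsort(y))
    return float(np.corrcoef(rank_x, rank_y)[0, 1])


# Hypothetical example: a random "image" with a sign perturbation of size 8/255.
rng = np.random.default_rng(0)
clean_img = rng.random((32, 32, 3))
epsilon = 8.0 / 255.0
adv_img = np.clip(clean_img + epsilon * np.sign(rng.standard_normal(clean_img.shape)), 0.0, 1.0)

metrics = perturbation_metrics(clean_img, adv_img)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Given one metric value per attack and a matching list of human stealthiness ratings, `spearman(metric_values, human_ratings)` would give one rough measure of how well that metric tracks perception. The tie-free ranking is a simplification; real questionnaire data would likely need a tie-aware variant.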