Abstract
Deep neural networks are susceptible to \emph{adversarial} attacks. In
computer vision, well-crafted perturbations to images can cause neural networks
to make mistakes such as confusing a cat with a computer. Previous adversarial
attacks have been designed to degrade the performance of models or to cause
machine learning models to produce specific outputs chosen ahead of time by the
attacker. We introduce attacks that instead \emph{reprogram} the target model to
perform a task chosen by the attacker---without the attacker needing to specify
or compute the desired output for each test-time input. This attack finds a
single adversarial perturbation that can be added to all test-time inputs to a
machine learning model in order to cause the model to perform a task chosen by
the adversary---even if the model was not trained to do this task. These
perturbations can thus be considered a program for the new task. We demonstrate
adversarial reprogramming on six ImageNet classification models, repurposing
these models to perform a counting task as well as classification tasks:
classifying MNIST and CIFAR-10 examples presented as inputs to these ImageNet
models.
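
To make the mechanism concrete, below is a minimal sketch (not the authors' published code) of adversarial reprogramming in PyTorch: a single learned perturbation, shared across all inputs, repurposes a frozen pretrained ImageNet classifier to classify MNIST digits. The choice of ResNet-50, the tanh parameterization, the placement of the digit in the image centre, and the hard-coded mapping of the first ten ImageNet logits to MNIST labels are illustrative assumptions, not details taken from the paper.

```python
import torch
import torch.nn.functional as F
import torchvision

device = "cuda" if torch.cuda.is_available() else "cpu"

# Frozen pretrained ImageNet model: the attacker never updates its weights.
model = torchvision.models.resnet50(weights="IMAGENET1K_V1").to(device).eval()
for p in model.parameters():
    p.requires_grad_(False)

# The adversarial "program": one perturbation the size of an ImageNet input.
W = torch.zeros(1, 3, 224, 224, device=device, requires_grad=True)

# Fixed mask that zeroes the centre patch where each MNIST digit is pasted.
mask = torch.ones(1, 3, 224, 224, device=device)
mask[:, :, 98:126, 98:126] = 0.0

def reprogram(x_mnist):
    """Embed a batch of 28x28 MNIST digits into the learned perturbation frame."""
    batch = x_mnist.size(0)
    canvas = torch.zeros(batch, 3, 224, 224, device=device)
    canvas[:, :, 98:126, 98:126] = x_mnist  # broadcast 1 channel -> 3 channels
    # tanh keeps the perturbation bounded; the mask leaves the digit untouched
    return canvas + torch.tanh(W * mask)

opt = torch.optim.Adam([W], lr=0.05)
loader = torch.utils.data.DataLoader(
    torchvision.datasets.MNIST(".", train=True, download=True,
                               transform=torchvision.transforms.ToTensor()),
    batch_size=64, shuffle=True)

# Optimize only W: the loss rewards mapping digit i to ImageNet class i.
for x, y in loader:
    x, y = x.to(device), y.to(device)
    logits = model(reprogram(x))[:, :10]  # first 10 ImageNet classes <-> digits
    loss = F.cross_entropy(logits, y)
    opt.zero_grad()
    loss.backward()
    opt.step()
```

At test time the same learned W is added to every embedded input, so the perturbation itself acts as the "program" for the new task, while the target model's weights remain unchanged.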