Deep neural networks (DNNs) have transformed several artificial intelligence
research areas including computer vision, speech recognition, and natural
language processing. However, recent studies have demonstrated that DNNs are
vulnerable to adversarial manipulations at testing time. Specifically, suppose
we have a testing example whose label is correctly predicted by a DNN
classifier. An attacker can add small, carefully crafted noise to the testing
example such that the DNN classifier predicts an incorrect label; the
crafted testing example is called an adversarial example. Such attacks are called
evasion attacks. Evasion attacks are one of the biggest challenges for
deploying DNNs in safety- and security-critical applications such as
self-driving cars. In this work, we develop new methods to defend against
evasion attacks. Our key observation is that adversarial examples are close to
the classification boundary. Therefore, we propose region-based classification,
which is robust to adversarial examples. For a benign or adversarial testing
example, we ensemble information in a hypercube centered at the example to
predict its label. In contrast, traditional classifiers perform point-based
classification, i.e., given a testing example, the classifier predicts its
label based on the testing example alone. Our evaluation results on the MNIST and CIFAR-10 datasets
demonstrate that our region-based classification can significantly mitigate
evasion attacks without sacrificing classification accuracy on benign examples.
Specifically, region-based classification achieves the same classification
accuracy on benign testing examples as point-based classification while being
significantly more robust to various evasion attacks.
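
The contrast between point-based and region-based classification can be illustrated with a minimal sketch. It is not the implementation evaluated in this work: the toy 2-D classifier `toy_point_classifier` stands in for a trained DNN, and the sketch assumes that "ensembling information in a hypercube" means sampling points uniformly from the hypercube and taking a majority vote over their point-based labels. The names `region_based_predict`, `radius`, and `num_samples` are hypothetical.

```python
import random
from collections import Counter


def toy_point_classifier(x):
    """Hypothetical stand-in for a trained DNN: maps a 2-D point to label 0 or 1.

    Class 1 is the half-plane x[1] > 0 plus a thin spike that pokes into the
    class-0 side, so points inside the spike mimic adversarial examples that
    sit close to the classification boundary.
    """
    in_spike = abs(x[0]) < 0.05 and x[1] > -0.5
    return 1 if (x[1] > 0 or in_spike) else 0


def region_based_predict(point_classifier, x, radius, num_samples=1000, seed=0):
    """Majority vote of point_classifier over a hypercube centered at x.

    Assumption: the ensemble is built by uniform sampling in the hypercube
    [x_i - radius, x_i + radius] followed by a majority vote.
    """
    rng = random.Random(seed)
    votes = Counter()
    for _ in range(num_samples):
        sample = [xi + rng.uniform(-radius, radius) for xi in x]
        votes[point_classifier(sample)] += 1
    return votes.most_common(1)[0][0]


benign = [0.10, -0.30]       # true label 0
adversarial = [0.04, -0.30]  # small shift of `benign` into the spike

# Point-based classification looks at the example alone.
point_benign = toy_point_classifier(benign)
point_adv = toy_point_classifier(adversarial)

# Region-based classification votes over the surrounding hypercube.
region_benign = region_based_predict(toy_point_classifier, benign, radius=0.2)
region_adv = region_based_predict(toy_point_classifier, adversarial, radius=0.2)
```

In this toy setting the point-based prediction flips from 0 to 1 on the shifted example, whereas the majority vote still returns 0, because only about a quarter of the hypercube falls inside the spike. The choice of `radius` matters: too small a hypercube reduces to point-based classification, while too large a hypercube can reach across the boundary and hurt accuracy on benign examples.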