Though deep neural networks have achieved state-of-the-art performance in
visual classification, recent studies have shown that they are vulnerable
to adversarial examples. Small and often imperceptible
perturbations to the input images are sufficient to fool even the most powerful deep
neural networks. Various defense methods have been proposed to address this
issue. However, they either require knowledge of the process used to generate
adversarial examples, or are not robust against new attacks specifically
designed to circumvent the existing defense. In this work, we introduce the
key-based network, a new detection-based defense mechanism that distinguishes
adversarial examples from normal ones using error-correcting output codes.
Multiple binary classifiers, each applied to a randomly chosen label-set,
produce a binary code vector that serves as a signature for matching normal
images and rejecting
adversarial examples. In contrast to existing defense methods, the proposed
method does not require knowledge of the process for generating adversarial
examples and can be applied to defend against different types of attacks. For
the practical black-box and gray-box scenarios, where the attacker does not
know the encoding scheme, we show empirically that the key-based network can
effectively detect adversarial examples generated by several state-of-the-art
attacks.
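
The matching step described above can be illustrated with a minimal sketch. It is not the implementation evaluated in this work: the function names `make_codebook` and `detect`, the seed standing in for the secret key, and the Hamming-distance threshold `max_distance` are all assumptions made for illustration, and the binary classifiers themselves are omitted, so their output is simply passed in as a bit vector.

```python
import numpy as np

def make_codebook(num_classes, code_length, key):
    """Draw a random binary codebook from a secret seed (hypothetical helper).

    Column j marks the randomly chosen label-set of binary classifier j;
    row c is the signature (codeword) expected for class c.
    """
    rng = np.random.default_rng(key)
    while True:
        codebook = rng.integers(0, 2, size=(num_classes, code_length))
        # Redraw until every class has a distinct codeword.
        if len(np.unique(codebook, axis=0)) == num_classes:
            return codebook

def detect(code, codebook, max_distance=0):
    """Return the matched class index, or -1 to reject the input.

    `code` is the bit vector produced by the binary classifiers. The
    threshold `max_distance` is an assumption of this sketch.
    """
    distances = np.sum(codebook != code, axis=1)  # Hamming distance per class
    best = int(np.argmin(distances))
    return best if distances[best] <= max_distance else -1

# A code equal to a class codeword is accepted as that class.
codebook = make_codebook(num_classes=10, code_length=32, key=1234)
print(detect(codebook[3], codebook))  # 3

# A hand-made two-class codebook: a code far from both codewords is rejected.
toy = np.array([[0, 0, 0, 0], [1, 1, 1, 1]])
print(detect(np.array([0, 0, 1, 1]), toy, max_distance=1))  # -1
```

In this sketch the codebook plays the role of the encoding scheme that the black-box or gray-box attacker does not know. A perturbation that flips the predicted label without reproducing a valid codeword yields a code that matches no class and is rejected.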