Convolutional Neural Networks (CNNs) have recently been shown to be considerably vulnerable to adversarial attacks: they can be easily misled by adversarial perturbations. As more aggressive methods are proposed, adversarial attacks can also be mounted in the physical world, posing practical threats to various CNN-powered applications. To secure CNNs, adversarial attack detection is considered the most critical defense. However, most existing works focus on superficial input patterns and merely search for a particular method to differentiate adversarial inputs from natural ones, without analyzing the CNN's inner vulnerability. As a result, they can only target specific physical adversarial attacks and lack the versatility to handle different attacks.
To address this issue, we propose DoPa -- a comprehensive CNN detection methodology for various physical adversarial attacks. By interpreting the CNN's vulnerability, we find that non-semantic adversarial perturbations trigger significantly abnormal activations that can even overwhelm the activations of semantic input patterns. We therefore augment the CNN recognition process with a self-verification stage that analyzes the semantics of these distinctive activation patterns. We apply this detection methodology to both image and audio CNN recognition scenarios. Experiments show that DoPa achieves an average detection success rate of 90% for image attacks and 92% for audio attacks.
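
As a rough illustration of the abnormal-activation insight above, the following PyTorch sketch flags inputs whose peak intermediate activations deviate sharply from a threshold calibrated on natural inputs. The monitored layer, the peak-magnitude statistic, and the mean-plus-k-standard-deviations rule are all illustrative assumptions, not DoPa's actual self-verification procedure, which analyzes the semantics of the activation patterns rather than their magnitude alone.

    import torch
    import torch.nn as nn

    class ActivationVerifier:
        """Hypothetical sketch: flag inputs whose intermediate activations
        look abnormally strong, approximating the observation that
        non-semantic perturbations over-activate a CNN."""

        def __init__(self, model: nn.Module, layer: nn.Module):
            self.model = model
            self.threshold = None
            self._peak = None
            # Hook an intermediate layer (which layer to watch is an assumption).
            layer.register_forward_hook(self._hook)

        def _hook(self, module, inputs, output):
            # Record the peak absolute activation for this forward pass.
            self._peak = output.detach().abs().amax().item()

        @torch.no_grad()
        def calibrate(self, natural_batches, k: float = 3.0):
            # Fit a simple mean + k*std threshold on clean inputs
            # (an illustrative calibration rule, not the paper's).
            peaks = []
            for x in natural_batches:
                self.model(x)
                peaks.append(self._peak)
            vals = torch.tensor(peaks)
            self.threshold = (vals.mean() + k * vals.std()).item()

        @torch.no_grad()
        def is_suspicious(self, x) -> bool:
            # Self-verification step: reject inputs whose activations
            # fall outside the calibrated natural range.
            self.model(x)
            return self._peak > self.threshold

    # Example usage (layer choice is hypothetical):
    # verifier = ActivationVerifier(cnn, cnn.layer3)
    # verifier.calibrate(clean_loader)
    # verifier.is_suspicious(test_image)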
Announcement: [The original DoPa draft on arXiv has been revised and already submitted to a conference; this short abstract was submitted only for presentation at the KDD 2019 AIoT Workshop.]