In recent years, defending against adversarial perturbations of natural examples in
order to build robust machine learning models based on deep neural networks
(DNNs) has become an emerging research field at the intersection of deep
learning and security. In particular, MagNet, which consists of an adversary
detector and a data reformer, is by far one of the strongest defenses in the
black-box oblivious attack setting, where the attacker aims to craft
transferable adversarial examples from an undefended DNN model to bypass an
unknown defense module deployed on the same DNN model. Under this setting,
MagNet can successfully defend against a variety of attacks on DNNs, including the
high-confidence adversarial examples generated by the Carlini and Wagner
attack based on the $L_2$ distortion metric. However, in this paper, we show
that, under the same attack setting, adversarial examples crafted based on the
$L_1$ distortion metric can easily bypass MagNet and mislead the target DNN
image classifiers on MNIST and CIFAR-10. We also explain why the considered
approach yields adversarial examples with superior attack performance and
conduct extensive experiments on variants of MagNet to verify its lack of
robustness to $L_1$ distortion-based attacks. Notably, our results
substantially weaken the commonly held assumption that effective threat models
against MagNet require knowledge of the deployed defense technique when
attacking DNNs (i.e., the gray-box attack setting).
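For concreteness, the distortion metrics referenced above can be sketched as follows, using notation introduced here only for illustration ($\mathbf{x}_0$ for the natural example and $\mathbf{x}_{\mathrm{adv}}$ for its adversarial counterpart):
\[
\|\mathbf{x}_{\mathrm{adv}} - \mathbf{x}_0\|_p = \Big( \sum_{i} \big| x_{\mathrm{adv},i} - x_{0,i} \big|^p \Big)^{1/p},
\]
where $p=1$ yields the $L_1$ metric underlying the attacks studied in this paper and $p=2$ yields the $L_2$ metric used by the Carlini and Wagner attack.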