The wide adoption of artificial neural networks across domains has led to
increasing interest in defending them against adversarial attacks.
Preprocessing defense methods such as pixel discretization are particularly
attractive in practice due to their simplicity, low computational overhead, and
applicability to various systems. Such methods have been observed to work well
on simple datasets like MNIST but to break down on more complicated ones like
ImageNet under recently proposed strong white-box attacks. To understand the
conditions for success and the potential for improvement, we study the pixel discretization
defense method, including more sophisticated variants that take into account
the properties of the dataset being discretized. Our results again show poor
resistance to these strong attacks. We analyze our findings in a theoretical
framework and offer strong evidence that pixel discretization is unlikely to
work on all but the simplest of datasets. Furthermore, our arguments offer
insight into why some other preprocessing defenses may be insecure.
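To make the defense concrete: pixel discretization preprocesses an input by snapping each pixel value to the nearest element of a small codebook before classification. A minimal sketch (function and variable names are illustrative, not from the paper; the simplest variant uses a fixed two-point codebook, i.e., binarization) might look like:

```python
import numpy as np

def discretize(image, codewords):
    """Snap every pixel to its nearest codeword (basic pixel discretization)."""
    image = np.asarray(image, dtype=np.float32)
    codewords = np.asarray(codewords, dtype=np.float32)
    # For each pixel, index of the closest codeword.
    idx = np.abs(image[..., None] - codewords).argmin(axis=-1)
    return codewords[idx]

# Example: binarizing pixel intensities in [0, 1] with codebook {0, 1},
# as is effective on MNIST-like data.
img = np.array([[0.10, 0.80],
                [0.45, 0.95]])
print(discretize(img, [0.0, 1.0]))  # [[0. 1.] [0. 1.]]
```

More sophisticated variants, such as those studied in the paper, would choose the codebook based on properties of the dataset rather than fixing it in advance.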