It is by now well known that small adversarial perturbations can induce
classification errors in deep neural networks. In this paper, we take a
bottom-up, signal-processing perspective on this problem and show that
systematic exploitation of sparsity in natural data is a promising tool for
defense. For linear classifiers, we show that a sparsifying front end is
provably effective against $\ell_{\infty}$-bounded attacks, reducing the output
distortion due to the attack by a factor of roughly $K/N$, where $N$ is the
data dimension and $K$ is the sparsity level. We then extend this concept to deep
networks, showing that a "locally linear" model can be used to develop a
theoretical foundation for crafting attacks and defenses. We also devise
attacks based on the locally linear model that outperform the well-known
fast gradient sign method (FGSM) attack. We supplement our theoretical results
with experiments on the MNIST and
CIFAR-10 datasets, showing the efficacy of the proposed sparsity-based defense
schemes.
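
To make the $K/N$ claim concrete: roughly speaking, for a linear classifier with weight vector $w$, an $\ell_{\infty}$ attack of budget $\epsilon$ can shift the output $\langle w, x \rangle$ by up to $\epsilon \|w\|_1$, whereas a front end that retains only the top $K$ of $N$ coefficients confines the attack's effect to roughly $K$ coordinates. The following is a minimal numerical sketch of such a sparsifying front end; the identity basis, the function name `sparsify`, and the toy data are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def sparsify(x, basis, K):
    """Illustrative top-K front end: project x onto an orthonormal basis
    (columns of `basis`), keep the K largest-magnitude coefficients, and
    reconstruct. Sketch only, not the paper's implementation."""
    coeffs = basis.T @ x                        # analysis: basis coefficients
    small = np.argsort(np.abs(coeffs))[:-K]     # indices of the N-K smallest
    coeffs[small] = 0.0                         # zero out all but the top K
    return basis @ coeffs                       # synthesis: back to data domain

# Toy demo of the K/N effect for a linear classifier score w @ x.
rng = np.random.default_rng(0)
N, K, eps = 784, 50, 0.1
basis = np.eye(N)                    # identity basis, purely for illustration
w = rng.standard_normal(N)           # hypothetical linear classifier weights
x = np.zeros(N)
x[rng.choice(N, size=K, replace=False)] = rng.standard_normal(K)  # K-sparse input
e = eps * np.sign(w)                 # worst-case l_inf perturbation of w @ x

undefended = abs(w @ (x + e) - w @ x)   # equals eps * ||w||_1
defended = abs(w @ sparsify(x + e, basis, K) - w @ sparsify(x, basis, K))
print(f"undefended: {undefended:.1f}, defended: {defended:.1f}, K/N = {K/N:.3f}")
```

On this toy example, the defended output distortion is smaller than the undefended one by roughly the factor $K/N$, consistent with the scaling stated in the abstract.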