There is rising interest in studying the robustness of deep neural network
classifiers against adversaries, with both advanced attack and defence
techniques being actively developed. However, most recent work focuses on
discriminative classifiers, which model only the conditional distribution of
the labels given the inputs. In this paper, we propose and investigate the deep
Bayes classifier, which improves classical naive Bayes with conditional deep
generative models. We further develop methods for detecting adversarial
examples; these methods reject inputs with low likelihood under the generative model.
Experimental results suggest that deep Bayes classifiers are more robust than
deep discriminative classifiers, and that the proposed detection methods are
effective against many recently proposed attacks.