Uncertainty quantification in neural networks has gained considerable attention in
recent years. The most popular approaches, Bayesian neural networks (BNNs),
Monte Carlo dropout, and deep ensembles, have one thing in common: they are all
based on some form of mixture model. While BNNs build infinite mixture
models and are derived via variational inference, the latter two build finite
mixtures trained with the maximum likelihood method. In this work we
investigate the effect of training an infinite mixture distribution with the
maximum likelihood method instead of variational inference. We find that the
proposed objective leads to stochastic networks with an increased predictive
variance, which improves uncertainty-based identification of
misclassifications and robustness against adversarial attacks in comparison to
a standard BNN with equivalent network structure. The new model also displays
higher entropy on out-of-distribution data.
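Concretely, the distinction can be read off the training objective: variational inference maximizes the ELBO, an expected log-likelihood minus a KL term, whereas maximizing the likelihood of the infinite mixture places the log outside the Monte Carlo average over weight samples. Below is a minimal PyTorch sketch of such a mixture log-likelihood under a factorized Gaussian weight distribution; the names (`BayesianLinear`, `mixture_log_likelihood`, `mc_samples`) are illustrative assumptions, not the paper's implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BayesianLinear(nn.Module):
    """Linear layer with a factorized Gaussian distribution over weights."""
    def __init__(self, d_in, d_out):
        super().__init__()
        self.mu = nn.Parameter(torch.randn(d_out, d_in) * 0.1)
        # softplus(rho) gives the weight standard deviation
        self.rho = nn.Parameter(torch.full((d_out, d_in), -3.0))

    def forward(self, x):
        std = F.softplus(self.rho)
        w = self.mu + std * torch.randn_like(std)  # reparameterized weight sample
        return x @ w.t()

def mixture_log_likelihood(model, x, y, mc_samples=8):
    """log p(y|x) ~= log (1/S) sum_s p(y | x, w_s): the maximum likelihood
    objective for the infinite mixture, estimated with S weight samples."""
    # Each forward pass draws fresh weights; collect per-example log-likelihoods.
    log_probs = torch.stack([
        -F.cross_entropy(model(x), y, reduction="none")
        for _ in range(mc_samples)
    ])  # shape (S, batch)
    # log-mean-exp over the sample dimension, summed over the batch
    return (torch.logsumexp(log_probs, dim=0)
            - torch.log(torch.tensor(float(mc_samples)))).sum()
```

Training would then minimize `-mixture_log_likelihood(model, x, y)`. By Jensen's inequality this objective upper-bounds the expected log-likelihood term of the ELBO, so it penalizes disagreement between individual weight samples less strongly, which is consistent with the increased predictive variance reported above.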