We introduce an approach for training Variational Autoencoders (VAEs) that
are certifiably robust to adversarial attack. Specifically, we first derive
actionable bounds on the minimal size of an input perturbation required to
change a VAE's reconstruction by more than an allowed amount, with these bounds
depending on key parameters such as the Lipschitz constants of the encoder and
decoder.
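To give a sense of these bounds, consider a simplified sketch that ignores the
encoder's stochasticity (the inequality below is illustrative, not the exact
bound we derive): if the encoder mean map $\mu$ is $L_\mu$-Lipschitz and the
decoder $g$ is $L_g$-Lipschitz, then
\[
\lVert g(\mu(x + \delta)) - g(\mu(x)) \rVert \le L_g L_\mu \lVert \delta \rVert,
\]
so any perturbation $\delta$ that changes the reconstruction by more than $r$
must satisfy $\lVert \delta \rVert \ge r / (L_g L_\mu)$; this is why the
Lipschitz constants are the key quantities to control.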
We then show how these parameters can be controlled, thereby providing a
mechanism to ensure \textit{a priori} that a VAE will attain a desired level
of robustness. Moreover, we extend this to a complete practical approach for
training such VAEs that ensures these criteria are met.
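As an illustrative sketch of how such constants can be controlled in practice,
one standard technique (not necessarily the exact mechanism of our procedure)
is spectral normalization of each layer's weights; in the PyTorch snippet
below, \texttt{lipschitz\_mlp} and \texttt{Scale} are hypothetical helper
names used to bound the Lipschitz constants of simple MLP encoder and decoder
networks:
\begin{verbatim}
# Illustrative sketch only: spectral normalization is one standard way
# to bound a network's Lipschitz constant; names here are hypothetical.
import torch.nn as nn
from torch.nn.utils import spectral_norm

class Scale(nn.Module):
    # Multiplies its input by a fixed constant c; this map is c-Lipschitz.
    def __init__(self, c):
        super().__init__()
        self.c = c

    def forward(self, x):
        return self.c * x

def lipschitz_mlp(sizes, lip_const=1.0):
    # Each spectrally normalized linear layer has spectral norm close to 1
    # (enforced approximately via power iteration), and ReLU is 1-Lipschitz,
    # so their composition is approximately 1-Lipschitz; a final fixed
    # scaling then bounds the overall Lipschitz constant by lip_const.
    layers = []
    for i in range(len(sizes) - 1):
        layers.append(spectral_norm(nn.Linear(sizes[i], sizes[i + 1])))
        if i < len(sizes) - 2:
            layers.append(nn.ReLU())
    layers.append(Scale(lip_const))
    return nn.Sequential(*layers)

# Hypothetical usage: an encoder mean network with Lipschitz constant
# of roughly at most 2 and a decoder with constant roughly at most 5.
encoder_mu = lipschitz_mlp([784, 256, 32], lip_const=2.0)
decoder = lipschitz_mlp([32, 256, 784], lip_const=5.0)
\end{verbatim}
Since the product of per-layer constants only upper-bounds the network's true
Lipschitz constant, such constructions are typically conservative, trading
some expressiveness for the guarantee.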
Critically, our method allows one to specify a desired level of robustness
\emph{upfront} and then train a VAE that is guaranteed to achieve it. We
further demonstrate that these Lipschitz-constrained VAEs are more robust to
attack than standard VAEs in practice.