Adversarial examples pose a security risk as they can alter the decisions of a
machine learning classifier through slight input perturbations. Certified
robustness has been proposed as a mitigation where given an input $\mathbf{x}$,
a classifier returns a prediction and a certified radius $R$ with a provable
guarantee that any perturbation to $\mathbf{x}$ with $R$-bounded norm will not
alter the classifier's prediction. In this work, we show that these guarantees
can be invalidated due to limitations of floating-point representation that
cause rounding errors. We design a rounding search method that can efficiently
exploit this vulnerability to find adversarial examples against
state-of-the-art certifications in two threat models, which differ in how the
norm of the perturbation is computed. We show that the attack can be carried
out against linear classifiers that have exact certifiable guarantees and
against neural networks that have conservative certifications. In the weak
threat model, our experiments demonstrate attack success rates over 50% on
random linear classifiers, up to 23% for a linear SVM on the MNIST dataset, and
up to 15% for a neural network. In the strong threat model, the success rates
are lower but still positive. The floating-point errors exploited by our attacks
can range from small to large (e.g., $10^{-13}$ to $10^{3}$), showing that even
negligible errors can be systematically exploited to invalidate guarantees
provided by certified robustness. Finally, we propose a formal mitigation
approach based on rounded interval arithmetic, encouraging future
implementations of robustness certificates to account for the limitations of
modern computing architectures and thereby provide sound certifiable guarantees.
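As a minimal illustration of the mitigation idea (not the implementation
evaluated in this work), consider a linear classifier
$f(\mathbf{x}) = \mathbf{w}^{\top}\mathbf{x} + b$ with the exact $\ell_2$
certified radius $R = |\mathbf{w}^{\top}\mathbf{x} + b| / \|\mathbf{w}\|_2$.
The Python sketch below keeps every intermediate quantity as an interval that
encloses its exact real value, widens the bounds by one ulp after each
floating-point operation, and reports the pessimistic endpoint as the
certified radius; the helper names (\texttt{iv\_add},
\texttt{certified\_radius\_lower\_bound}, etc.) are illustrative only.

\begin{verbatim}
import math

# Sketch of rounded interval arithmetic: each quantity is an interval
# [lo, hi] that encloses the exact real value.  After every floating-point
# operation the bounds are widened by one ulp via math.nextafter, so the
# rounding error of that operation can never be underestimated.

def _widen(lo, hi):
    return math.nextafter(lo, -math.inf), math.nextafter(hi, math.inf)

def iv_add(a, b):
    return _widen(a[0] + b[0], a[1] + b[1])

def iv_mul(a, b):
    p = (a[0] * b[0], a[0] * b[1], a[1] * b[0], a[1] * b[1])
    return _widen(min(p), max(p))

def iv_dot(u, v):
    acc = (0.0, 0.0)
    for ui, vi in zip(u, v):
        acc = iv_add(acc, iv_mul((ui, ui), (vi, vi)))
    return acc

def certified_radius_lower_bound(w, x, b):
    """Sound lower bound on R = |<w, x> + b| / ||w||_2 for a linear classifier."""
    margin_lo, margin_hi = iv_add(iv_dot(w, x), (b, b))
    if margin_lo <= 0.0 <= margin_hi:
        return 0.0                             # sign of the margin is uncertain
    abs_margin_lo = min(abs(margin_lo), abs(margin_hi))  # smallest |margin|
    norm_hi = math.nextafter(math.sqrt(iv_dot(w, w)[1]), math.inf)  # upper bound on ||w||_2
    return math.nextafter(abs_margin_lo / norm_hi, -math.inf)       # round the quotient down

# Example: certify a toy two-dimensional linear classifier.
w, x, b = [0.3, -0.7], [1.0, 0.25], 0.05
print(certified_radius_lower_bound(w, x, b))
\end{verbatim}

Reporting the pessimistic endpoint in this way trades a marginally smaller
radius for a certificate that remains valid under floating-point rounding.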