Developing robust models against adversarial perturbations has been an active
area of research, and many algorithms have been proposed to train individual
robust models. Taking these pretrained robust models, we aim to study whether
it is possible to create an ensemble to further improve robustness. Several
previous attempts tackled this problem by ensembling soft-label predictions
but have been shown to be vulnerable to the latest attack methods. In this
paper, we show that if the robust training losses are diverse enough, a simple
hard-label voting ensemble can achieve a lower robust error than each
individual model. Furthermore, given a pool of robust models, we develop a
principled way to select which models to ensemble. Finally, to verify the
improved robustness, we conduct extensive experiments to study how to attack a
voting-based ensemble and develop several new white-box attacks. On the CIFAR-10
dataset, by ensembling several state-of-the-art pretrained defense models, our
method achieves a robust accuracy of 59.8%, outperforming all existing
defense models trained without additional data.
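
To make the distinction between soft-label and hard-label ensembling concrete, the following is a minimal sketch, not the selection or voting procedure developed in the paper. It assumes each model outputs a list of class probabilities for one input; the function names (`soft_label_ensemble`, `hard_label_vote`) and the rule that ties go to the lowest class index are illustrative assumptions.

```python
from collections import Counter


def argmax(values):
    """Index of the largest entry (first one wins on ties)."""
    return max(range(len(values)), key=lambda i: values[i])


def soft_label_ensemble(probs_per_model):
    """Average the class probabilities across models, then take the argmax."""
    n_models = len(probs_per_model)
    n_classes = len(probs_per_model[0])
    avg = [sum(p[c] for p in probs_per_model) / n_models for c in range(n_classes)]
    return argmax(avg)


def hard_label_vote(probs_per_model):
    """Each model casts one vote for its own argmax; the majority class wins.

    Assumption for this sketch: ties are broken by the lowest class index.
    """
    votes = Counter(argmax(p) for p in probs_per_model)
    top = max(votes.values())
    return min(c for c, v in votes.items() if v == top)


# Hypothetical outputs of three models on a single input with three classes.
probs = [
    [0.05, 0.90, 0.05],  # one very confident model predicts class 1
    [0.50, 0.40, 0.10],  # two less confident models predict class 0
    [0.50, 0.40, 0.10],
]
soft_prediction = soft_label_ensemble(probs)
hard_prediction = hard_label_vote(probs)
```

In this toy input the two rules disagree: averaging lets the single confident model dominate, while voting counts each model once regardless of its confidence.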