The growing interest in adversarial examples, i.e., maliciously modified
examples that fool a classifier, has resulted in many defenses intended to
detect them, render them harmless, or make the model more robust against
them. In this paper, we pave the way towards a new approach to improve the
robustness of a model against black-box transfer attacks. A removable
additional neural network is included in the target model and is designed to
induce the \textit{luring effect}, which tricks the adversary into choosing
false directions to fool the target model. The additional network is trained
with a loss function that acts on the order of the logits. Our
deception-based method only needs access to the predictions of the
target model and does not require a labeled data set. We explain the luring
effect using the notion of robust and non-robust useful features and
perform experiments on MNIST, SVHN and CIFAR10 to characterize and evaluate
this phenomenon. Additionally, we discuss two simple prediction schemes and
verify experimentally that our approach can be used as a defense to effectively
thwart an adversary who uses state-of-the-art attacks and is allowed to apply
large perturbations.