Machine learning models have been shown to be susceptible to carefully crafted
inputs, known as adversarial examples. Generating such adversarial examples
helps make models more robust and offers insight into their underlying
decision-making. Over the years, researchers have successfully attacked image
classifiers in both white-box and black-box settings.
However, these methods are not directly applicable to text, since text data is
discrete. In recent years, research on crafting adversarial examples against
textual applications has been on the rise. In this paper, we present a novel
approach to hard-label black-box attacks against Natural Language Processing
(NLP) classifiers, in which no model information is disclosed and the attacker
can only query the model for its final decision, without access to confidence
scores for the individual classes. Such an attack scenario applies to
real-world black-box models being used for security-sensitive applications such
as sentiment analysis and toxic content detection.
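To make the threat model concrete, the following minimal Python sketch illustrates the hard-label query interface described above. The names `query_model` and `is_adversarial` are hypothetical stand-ins, not part of this paper's implementation: the attacker submits text to a deployed classifier and observes only the final predicted label, never logits or class probabilities.

```python
from typing import Callable


def query_model(text: str) -> str:
    """Hypothetical stand-in for a deployed black-box classifier API.

    In the hard-label setting it returns only the final predicted label
    (e.g. "positive" or "toxic"), with no confidence scores.
    """
    raise NotImplementedError("replace with a call to the target model's API")


def is_adversarial(candidate: str,
                   original_label: str,
                   classifier: Callable[[str], str] = query_model) -> bool:
    # The only feedback available to a hard-label attacker is whether the
    # predicted label of the perturbed input differs from the original one.
    return classifier(candidate) != original_label
```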