The convolutional neural network (CNN) architecture is increasingly being
applied to new domains, such as malware detection, where it can learn
malicious behavior directly from the raw bytes of executables. These
architectures achieve impressive performance with no feature-engineering
effort, but their robustness against active attackers is not yet
understood. Such malware detectors could face a new attack vector in the form
of adversarial interference with the classification model. Existing evasion
attacks, which aim to cause misclassification of test-time instances and have
been studied extensively for image classifiers, are not directly applicable:
the input semantics of executables prevent arbitrary changes to the binaries. This paper
explores the area of adversarial examples for malware detection. By training an
existing model on a production-scale dataset, we show that some previous
attacks are less effective than initially reported, and we highlight
architectural weaknesses that enable new attack strategies against malware
classifiers. Finally, we explore the generalizability of different attack
strategies, the trade-offs involved in increasing their effectiveness, and
the transferability of single-step attacks.
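
The constraint alluded to above can be made concrete with a short sketch. Assuming a MalConv-style detector (a byte-embedding CNN; the abstract does not name a specific model), a single-step evasion attack must respect the input semantics: rather than perturbing existing bytes, which would break the executable, it may only append bytes and perturb those. Everything below (the ByteCNN model, the append_fgsm function, sizes, and the step size) is an illustrative assumption, not the paper's exact method.

```python
import torch
import torch.nn as nn

class ByteCNN(nn.Module):
    """Hypothetical MalConv-style detector: byte embedding, 1-D convolution,
    global max pooling, and a single malware logit. Sizes are illustrative."""
    def __init__(self, vocab=257, emb=8, channels=128, window=512):
        super().__init__()
        self.embed = nn.Embedding(vocab, emb)   # 256 byte values + 1 padding token
        self.conv = nn.Conv1d(emb, channels, kernel_size=window, stride=window)
        self.fc = nn.Linear(channels, 1)

    def logit_from_embeddings(self, z):          # z: (batch, length, emb)
        h = torch.relu(self.conv(z.transpose(1, 2))).max(dim=2).values
        return self.fc(h).squeeze(1)

    def forward(self, x):                        # x: (batch, length) byte ids
        return self.logit_from_embeddings(self.embed(x))


def append_fgsm(model, sample, n_append=1024, eps=0.5):
    """Single-step append attack: gradients are taken only w.r.t. the embeddings
    of bytes appended after the end of the file, so the original binary (and its
    behavior) is untouched; each appended slot is then snapped back to the byte
    whose embedding is nearest to the perturbed one."""
    filler = torch.randint(0, 256, (1, n_append))        # random appended bytes
    x = torch.cat([sample, filler], dim=1)
    z = model.embed(x).detach().requires_grad_(True)
    model.logit_from_embeddings(z).sum().backward()      # d(malware logit)/dz
    z_adv = z - eps * z.grad.sign()                      # one FGSM step toward "benign"
    byte_table = model.embed.weight[:256]                # candidate byte embeddings
    # Snap each appended position to the closest valid byte embedding.
    nearest = torch.cdist(z_adv[0, -n_append:], byte_table).argmin(dim=1)
    return torch.cat([sample, nearest.unsqueeze(0)], dim=1)


if __name__ == "__main__":
    torch.manual_seed(0)
    model = ByteCNN()
    binary = torch.randint(0, 256, (1, 4096))            # stand-in for a real PE file
    adv = append_fgsm(model, binary)
    print(model(binary).item(), model(adv).item())       # logit before vs. after
```

The snap-to-nearest-byte step is the key difference from image-domain FGSM: the perturbation must land on a valid byte sequence, not an arbitrary point in input space, which is one concrete reading of why attacks designed for image classifiers do not carry over directly.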