Machine learning (ML) techniques are increasingly common in security
applications, such as malware and intrusion detection. However, ML models are
often susceptible to evasion attacks, in which an adversary makes changes to
the input (such as malware) in order to avoid being detected. A conventional
approach to evaluating ML robustness to such attacks, and to designing robust
ML, is to consider simplified feature-space models of attacks, in which the
attacker changes ML features directly to effect evasion, while minimizing or
constraining the magnitude of this change. We investigate the effectiveness of
this approach to designing robust ML in the face of attacks that can be
realized in actual malware (realizable attacks). We demonstrate that in the
context of structure-based PDF malware detection, such techniques appear to
have limited effectiveness, but they are effective with content-based
detectors. In either case, we show that augmenting the feature-space models
with conserved features (those that cannot be unilaterally modified without
compromising malicious functionality) significantly improves performance.
Finally, we show that feature-space models yield robustness that generalizes
across a variety of realizable attacks, in contrast to classifiers that
are tuned to be robust to a specific realizable attack.