There has been a surge of interest in using machine learning (ML) to
automatically detect malware through its dynamic behaviors. These approaches
have achieved significantly higher detection rates and lower false positive
rates at large scale than traditional malware analysis methods. ML-based
threat detection has proven to be a good cop for guarding platform security.
However, it is imperative to ask: is ML-powered security resilient enough?
In this paper, we juxtapose the resiliency and trustworthiness of ML
algorithms for security through a case study that evaluates the resiliency of
ransomware detection using a generative adversarial network (GAN). In this
case study, we propose using a GAN to automatically produce dynamic features
that exhibit generalized malicious behaviors and can reduce the efficacy of
black-box ransomware classifiers. We examine the quality of the GAN-generated
samples by comparing the statistical similarity of these samples to real
ransomware and benign software. Further, we investigate the latent subspace
where the GAN-generated samples lie and explore reasons why such samples cause
a certain class of ransomware classifiers to degrade in performance. Our focus
is to emphasize the defense improvements that ML-based approaches to
ransomware detection need before deployment in the wild. Our results and
discoveries should raise relevant questions for defenders, such as how ML
models can be made more resilient for robust enforcement of security objectives.