Model stealing attacks have been successfully used in many machine learning
domains, but little is understood about how these attacks work against models
that perform malware detection. Malware detection, and security domains in
general, present unique conditions. In particular, there are stringent
requirements for low false positive rates (FPR). Antivirus products (AVs) that
use machine learning are very complex systems that are difficult to steal,
malware binaries change continually, and the whole environment is adversarial
by nature.
This study evaluates active learning model stealing attacks against publicly
available stand-alone machine learning malware classifiers and against AVs.
The study proposes a new neural network architecture for surrogate models
(dualFFNN) and a new model stealing attack that combines transfer and active
learning for surrogate creation (FFNN-TL). We achieved good surrogates of the
stand-alone classifiers, with up to 99\% agreement with the target models,
using less than 4\% of the original training dataset. Good surrogates of the
AV systems were also trained, with up to 99\% agreement and fewer than 4,000
queries.
The study then uses the best surrogates to generate adversarial malware that
evades the target models, both stand-alone classifiers and AVs (with and
without an internet connection). Results show that surrogate models can
generate adversarial malware that evades the targets, albeit with a lower
success rate than using the target models directly. Surrogates nevertheless
remain a good option, since using the AVs for adversarial malware generation
is highly time-consuming and easily detected when the AVs are connected to
the internet.