A Comparison of Adversarial Learning Techniques for Malware Detection

Information in the literature database is collected automatically.

Source

https://arxiv.org/abs/2308.09958

PDF

https://arxiv.org/pdf/2308.09958

Paper Information

Authors: Pavla Louthánová; Matouš Kozák; Martin Jureček; Mark Stamp
Published: 2023-08-19
Affiliation: Faculty of Information Technology, Czech Technical University in Prague
Country: Czechia
Conference: Computing Research Repository (CoRR)

Labels Estimated by AI

Adversarial Example; Malware Detection; Adversarial attack

These labels were automatically added by AI and may be inaccurate.
For details, see About Literature Database.

Abstract

Machine learning has proven to be a useful tool for automated malware detection, but machine learning models have also been shown to be vulnerable to adversarial attacks. This article addresses the problem of generating adversarial malware samples, specifically malicious Windows Portable Executable files. We summarize and compare work that has focused on adversarial machine learning for malware detection. We use gradient-based, evolutionary algorithm-based, and reinforcement-based methods to generate adversarial samples, and then test the generated samples against selected antivirus products. We compare the selected methods in terms of accuracy and practical applicability. The results show that applying optimized modifications to previously detected malware can lead to incorrect classification of the file as benign. It is also known that generated malware samples can be successfully used against detection models other than those used to generate them and that using combinations of generators can create new samples that evade detection. Experiments show that the Gym-malware generator, which uses a reinforcement learning approach, has the greatest practical potential. This generator achieved an average sample generation time of 5.73 seconds and the highest average evasion rate of 44.11%. Using the Gym-malware generator in combination with itself improved the evasion rate to 58.35%.
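The evasion rates quoted in the abstract can be read as a simple ratio. The following is a minimal sketch, assuming that an evasion rate is the fraction of generated samples that a detector labels as benign, and that the average is taken over detectors; the paper's exact definition may differ. The function names and the detector names `av_a` and `av_b` are hypothetical and the verdicts are made-up illustration data, not results from the paper.

```python
from typing import Dict, List


def evasion_rate(verdicts: List[bool]) -> float:
    """Fraction of samples that evaded one detector.

    Assumption: each entry is True when the detector labeled the
    adversarial sample as benign (i.e. the sample evaded detection).
    """
    if not verdicts:
        raise ValueError("at least one verdict is required")
    return sum(verdicts) / len(verdicts)


def average_evasion_rate(per_detector: Dict[str, List[bool]]) -> float:
    """Unweighted mean of the per-detector evasion rates (assumed averaging)."""
    if not per_detector:
        raise ValueError("at least one detector is required")
    rates = [evasion_rate(v) for v in per_detector.values()]
    return sum(rates) / len(rates)


# Illustration data only: two hypothetical detectors, four samples each.
example = {
    "av_a": [True, False, False, True],
    "av_b": [True, True, True, False],
}
print(f"{average_evasion_rate(example):.2%}")
```

Under this reading, a reported average of 44.11% would mean that a bit under half of the generated samples were classified as benign, averaged over the tested detectors.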

External Datasets

3,625 harmless executable files from a newly installed Windows 11 system

3,625 malicious executable files from the VirusShare repository

3,000 malicious samples for Gym-malware model

1,000 validation files for Gym-malware model