Abstract
National security is threatened by malware, which remains one of the most
dangerous and costly cyber threats. As of last year, researchers reported 1.3
billion known malware specimens, motivating the use of data-driven machine
learning (ML) methods for analysis. However, shortcomings in existing ML
approaches hinder their mass adoption. These challenges include detecting
novel malware and classifying malware under class imbalance, a situation in
which malware families are not equally represented in the data. Our work
addresses these shortcomings with MalwareDNA, an advanced
dimensionality reduction and feature extraction framework. We demonstrate
stable performance under class imbalance on two tasks, malware family
classification and novel malware detection, at the cost of an increased
abstention (reject-option) rate.
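
To make the reject-option trade-off concrete, here is a minimal sketch of a classifier that abstains when its confidence is low. It is not the MalwareDNA method, which the abstract does not detail. The function names, the confidence threshold, and the family labels are all hypothetical, and the per-family scores are assumed to come from some upstream model.

```python
from typing import Dict, Optional, Sequence


def predict_with_reject(scores: Dict[str, float], threshold: float) -> Optional[str]:
    """Return the top-scoring family, or None (abstain) if its score is below threshold.

    `scores` maps a hypothetical family label to a confidence value.
    """
    label, best = max(scores.items(), key=lambda kv: kv[1])
    return label if best >= threshold else None


def abstention_rate(predictions: Sequence[Optional[str]]) -> float:
    """Fraction of samples on which the classifier abstained."""
    if not predictions:
        return 0.0
    return sum(p is None for p in predictions) / len(predictions)


def selective_accuracy(predictions: Sequence[Optional[str]],
                       truths: Sequence[str]) -> float:
    """Accuracy computed only over samples the classifier chose to label."""
    answered = [(p, t) for p, t in zip(predictions, truths) if p is not None]
    if not answered:
        return 0.0
    return sum(p == t for p, t in answered) / len(answered)


if __name__ == "__main__":
    # Hypothetical scores for three samples over two made-up families.
    samples = [
        {"family_a": 0.90, "family_b": 0.10},
        {"family_a": 0.55, "family_b": 0.45},  # low confidence, will be rejected
        {"family_a": 0.20, "family_b": 0.80},
    ]
    truths = ["family_a", "family_b", "family_b"]
    preds = [predict_with_reject(s, threshold=0.6) for s in samples]
    print(preds, abstention_rate(preds), selective_accuracy(preds, truths))
```

Raising the threshold in this sketch generally increases the abstention rate while leaving only higher-confidence predictions to be scored, which is the kind of trade-off the abstract refers to.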