Abstract
Machine Learning (ML) models have been used for malware detection for
over two decades. This has ignited an ongoing arms race between
malware authors and antivirus systems, compelling researchers to propose
defenses for malware-detection models against evasion attacks. However, most if
not all existing defenses against evasion attacks suffer from sizable
performance degradation and/or can defend against only specific attacks, which
makes them less practical in real-world settings. In this work, we develop a
certified defense, DRSM (De-Randomized Smoothed MalConv), by redesigning the
de-randomized smoothing technique for the domain of malware detection.
Specifically, we propose a window ablation scheme to provably limit the impact
of adversarial bytes while maximally preserving local structures of the
executables. After showing how DRSM is theoretically robust against attacks
with contiguous adversarial bytes, we verify its performance and certified
robustness experimentally, where we observe only marginal accuracy drops as the
cost of robustness. To our knowledge, we are the first to offer certified
robustness in the realm of static detection of malware executables. More
surprisingly, through evaluating DRSM against 9 empirical attacks of different
types, we observe that the proposed defense is empirically robust to some
extent against a diverse set of attacks, some of which even fall outside the
scope of its original threat model. In addition, we collected 15.5K recent
benign raw executables from diverse sources, which will be made public as a
dataset called PACE (Publicly Accessible Collection(s) of Executables) to
alleviate the scarcity of publicly available benign datasets for studying
malware detection and provide future research with more representative,
up-to-date data.
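
The window ablation idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the voting rule or the certification condition, so the majority vote over non-overlapping windows, the margin test, and the bound on affected windows are assumptions borrowed from the general de-randomized smoothing recipe. The base classifier `toy_window_classifier` is a hypothetical stand-in (DRSM uses MalConv), and all function names are illustrative. The sketch also assumes the adversary overwrites one contiguous run of bytes in place.

```python
import math
from collections import Counter
from typing import Callable, List, Tuple


def split_into_windows(data: bytes, window_size: int) -> List[bytes]:
    """Split a byte string into consecutive, non-overlapping windows."""
    return [data[i:i + window_size] for i in range(0, len(data), window_size)]


def toy_window_classifier(window: bytes) -> int:
    """Hypothetical base model: 1 (malicious) if the window contains 0xCC, else 0.

    A real system would call a trained model on the window instead.
    """
    return 1 if 0xCC in window else 0


def smoothed_predict(
    data: bytes,
    window_size: int,
    classify: Callable[[bytes], int] = toy_window_classifier,
) -> Tuple[int, Counter]:
    """Classify each window independently and return (majority label, vote counts)."""
    if not data:
        raise ValueError("input must contain at least one byte")
    votes = Counter(classify(w) for w in split_into_windows(data, window_size))
    # Assumed tie-break: prefer the smaller label when vote counts are equal.
    label = max(votes, key=lambda c: (votes[c], -c))
    return label, votes


def max_windows_touched(patch_size: int, window_size: int) -> int:
    """Conservative upper bound on how many windows one contiguous patch overlaps."""
    return math.ceil(patch_size / window_size) + 1


def is_certified(votes: Counter, patch_size: int, window_size: int) -> bool:
    """Assumed certificate: the vote margin exceeds twice the number of affected windows.

    Each affected window can remove one vote from the winner and add one to the
    runner-up, so the margin shrinks by at most 2 per affected window.
    """
    ranked = votes.most_common()
    top = ranked[0][1]
    runner_up = ranked[1][1] if len(ranked) > 1 else 0
    return top - runner_up > 2 * max_windows_touched(patch_size, window_size)
```

For instance, calling `smoothed_predict(bytes(64), 8)` yields eight window votes, and `is_certified` can then be queried for different patch sizes to see how the certificate weakens as the assumed adversarial patch grows. Because each window is classified in isolation, bytes inside a patch can only influence the votes of the windows they overlap, which is the property the certificate relies on.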