Abstract
Similarity has been applied to a wide range of security applications,
typically as part of machine learning models. We examine the problem posed by
masquerading samples, that is, samples crafted by bad actors to be similar or
near-identical to legitimate samples. We find that these samples potentially
create significant problems for machine learning solutions. The primary problem
is that bad actors can circumvent machine learning solutions by using
masquerading samples.
We then examine the interplay between digital signatures and machine learning
solutions. In particular, we focus on executable files and code signing. We
offer a taxonomy for masquerading files. We use a combination of similarity and
clustering to find masquerading files. We use the insights gathered in this
process to offer improvements to similarity-based and machine learning security
solutions.
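
As a rough illustration of how similarity and clustering might be combined to
surface masquerading files, the following minimal sketch groups files whose
byte n-gram sets are highly similar and flags any group whose members do not
share a single signer. Everything here is an assumption made for illustration:
the abstract does not state which similarity measure, clustering method, or
threshold is used, and the names `ngrams`, `jaccard`, `cluster`, and
`suspicious_clusters`, as well as the `(content, signer)` input layout, are
hypothetical.

```python
from itertools import combinations


def ngrams(data: bytes, n: int = 4) -> set:
    """Set of byte n-grams; a stand-in for whatever similarity features are used."""
    return {data[i:i + n] for i in range(len(data) - n + 1)}


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity of two feature sets (1.0 when both are empty)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def cluster(samples: dict, threshold: float = 0.8) -> list:
    """Single-link clustering: two files are linked when similarity >= threshold.

    `samples` maps a file name to a (content_bytes, signer_or_None) tuple.
    The threshold is an illustrative value, not one taken from the paper.
    """
    names = list(samples)
    feats = {name: ngrams(samples[name][0]) for name in names}
    parent = {name: name for name in names}

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(names, 2):
        if jaccard(feats[a], feats[b]) >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for name in names:
        groups.setdefault(find(name), []).append(name)
    return list(groups.values())


def suspicious_clusters(samples: dict, threshold: float = 0.8) -> list:
    """Clusters of near-identical files that do not share one signer.

    A signer of None means the file is unsigned. Mixed signers within a
    cluster is only a heuristic hint of masquerading, not proof.
    """
    flagged = []
    for group in cluster(samples, threshold):
        signers = {samples[name][1] for name in group}
        if len(signers) > 1:
            flagged.append(sorted(group))
    return flagged
```

In this sketch a cluster that mixes a signed file with an unsigned or
differently signed near-copy is only a candidate for review. The pairwise
comparison is quadratic in the number of files, so a real corpus would need an
approximate similarity scheme.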