Abstract
In recent years, enterprises have been targeted by advanced adversaries who
leverage creative ways to infiltrate their systems and move laterally to gain
access to critical data. One increasingly common evasive method is to hide the
malicious activity behind a benign program by using tools that are already
installed on user computers. These programs are usually part of the operating
system distribution or are other user-installed binaries, which is why this type
of attack is called "Living-Off-the-Land". Detecting these attacks is challenging,
as adversaries may not create malicious files on the victim computers and
anti-virus scans consequently fail to detect them. We propose the design of an Active
Learning framework called LOLAL for detecting Living-Off-the-Land attacks that
iteratively selects a set of uncertain and anomalous samples for labeling by a
human analyst. LOLAL is specifically designed to work well when a limited
number of labeled samples are available for training machine learning models to
detect attacks. We investigate methods to represent command-line text using
word-embedding techniques, and design ensemble boosting classifiers to
distinguish malicious and benign samples based on the embedding representation.
We leverage a large, anonymized dataset collected by an endpoint security
product and demonstrate that our ensemble classifiers achieve an average F1
score of 0.96 at classifying different attack classes. We show that our active
learning method consistently improves classifier performance as more
training data is labeled, and converges in fewer than 30 iterations when
starting with a small number of labeled instances.
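
The selection step described above, in which each active learning iteration picks a set of uncertain and anomalous samples for a human analyst, can be illustrated with a minimal Python sketch. It is not LOLAL's implementation: the abstract does not specify the selection criteria, so the function name `select_for_labeling`, the "closest to 0.5" uncertainty measure, the externally supplied anomaly scores, and the per-iteration budgets are all assumptions made for illustration.

```python
from typing import List, Sequence


def select_for_labeling(
    malicious_prob: Sequence[float],
    anomaly_score: Sequence[float],
    n_uncertain: int,
    n_anomalous: int,
) -> List[int]:
    """Return indices of unlabeled samples to send to the analyst.

    malicious_prob: classifier's predicted probability that each sample
        is malicious (hypothetical input, one value per sample).
    anomaly_score: higher means more anomalous (hypothetical input,
        produced by any anomaly detector).
    """
    indices = range(len(malicious_prob))

    # Assumed uncertainty measure: distance of the predicted probability
    # from the 0.5 decision boundary (smaller distance = more uncertain).
    by_uncertainty = sorted(indices, key=lambda i: abs(malicious_prob[i] - 0.5))
    uncertain = by_uncertainty[:n_uncertain]

    # Assumed anomaly criterion: highest anomaly scores first, skipping
    # samples already chosen as uncertain so the analyst sees no duplicates.
    chosen = set(uncertain)
    by_anomaly = sorted(indices, key=lambda i: anomaly_score[i], reverse=True)
    anomalous = [i for i in by_anomaly if i not in chosen][:n_anomalous]

    return uncertain + anomalous


# Toy scores for five unlabeled command lines (made-up values).
probs = [0.02, 0.48, 0.97, 0.55, 0.10]
scores = [0.1, 0.2, 0.9, 0.3, 0.8]
picked = select_for_labeling(probs, scores, n_uncertain=2, n_anomalous=1)
print(picked)
```

In a full loop, the analyst's labels for the returned indices would be added to the training set and the classifier retrained before the next iteration.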