Intrusion detection has focused primarily on detecting cyberattacks at the
event level. Because network data is voluminous and attacks make up only a
small fraction of it, machine learning approaches have focused on improving
accuracy and reducing false positives, but this has frequently resulted in
overfitting. In
addition, the volume of intrusion detection alerts is large and creates fatigue
in the human analyst who must review them. This research addresses the problems
associated with event-level intrusion detection and the large volumes of
intrusion alerts by applying active learning and cyber situation awareness.
This paper reports the results of two experiments using the UNSW-NB15 dataset.
The first experiment evaluated sampling approaches for querying the oracle as
part of active learning, then trained a Random Forest classifier on the
selected samples and evaluated its performance. The second experiment applied
cyber
situation awareness by aggregating the detection results of the first
experiment and calculating the probability that a computer system was part of a
cyberattack. This research showed that shifting the perspective from
event-level alerts to the probability that a computer system was part of an
attack improved the accuracy of detection and reduced the volume of alerts
that a human analyst would need to review.
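
The aggregation step in the second experiment can be illustrated with a minimal
sketch. The abstract does not state how the per-system probability was
calculated, so the noisy-OR combination below is an assumption chosen for
illustration, not the method used in this research. The function name
`host_attack_probability`, the example addresses, the event scores, and the 0.7
alert threshold are likewise hypothetical.

```python
from collections import defaultdict


def host_attack_probability(events):
    """Aggregate event-level attack scores into one probability per host.

    `events` is an iterable of (host, p_attack) pairs, where p_attack is the
    classifier's estimated probability that a single event is malicious.
    Assumption (not from the paper): events are combined with a noisy-OR,
    i.e. P(host involved) = 1 - product(1 - p_attack) over its events.
    """
    p_all_benign = defaultdict(lambda: 1.0)
    for host, p_attack in events:
        if not 0.0 <= p_attack <= 1.0:
            raise ValueError(f"probability out of range: {p_attack}")
        # Probability that every event seen so far for this host is benign.
        p_all_benign[host] *= 1.0 - p_attack
    return {host: 1.0 - q for host, q in p_all_benign.items()}


# Hypothetical event-level scores, e.g. from a trained classifier.
events = [("10.0.0.5", 0.5), ("10.0.0.5", 0.5), ("10.0.0.9", 0.1)]
scores = host_attack_probability(events)

# One alert per host above the (hypothetical) threshold, not one per event.
alerts = [host for host, score in scores.items() if score >= 0.7]
```

In this toy input, three event-level scores collapse into a single host-level
alert, which is the kind of volume reduction the abstract describes. Other
aggregation rules (for example, a mean or a maximum over events) would fit the
same interface.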