Recreating cyber-attack alert data with a high level of fidelity is
challenging due to the intricate interaction between features, non-homogeneity
of alerts, and potential for rare yet critical samples. Generative Adversarial
Networks (GANs) have been shown to effectively learn complex data distributions
with the intent of creating increasingly realistic data. This paper presents
the application of GANs to cyber-attack alert data and shows that GANs not only
successfully learn to generate realistic alerts, but also reveal feature
dependencies within alerts. This is accomplished by reviewing the intersection
of histograms for varying alert-feature combinations between the ground truth
and generated datasets. Traditional statistical metrics, such as conditional and
joint entropy, are also employed to verify the accuracy of these dependencies.
Finally, it is shown that a mutual information constraint on the network can be
used to increase the generation of low-probability but critical alert values. By
mapping alerts to a set of attack stages, it is shown that the output of these
low-probability alerts has direct contextual meaning for cybersecurity
analysts. Overall, this work provides the basis for generating new cyber
intrusion alerts and provides evidence that synthesized alerts emulate critical
dependencies from the source dataset.
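
As an illustration of the evaluation described above, the following is a minimal sketch of histogram intersection and conditional entropy over categorical alert features. It is not the paper's implementation: the function names, the two-feature alert layout (port, signature), and the toy alert values are all assumptions made for this example.

```python
from collections import Counter
from math import log2


def histogram_intersection(real, generated):
    """Sum of min(p_real(v), p_gen(v)) over all observed values v.

    Returns 1.0 when the two normalized histograms are identical
    and 0.0 when they share no values.
    """
    p, q = Counter(real), Counter(generated)
    n_p, n_q = len(real), len(generated)
    return sum(min(p[v] / n_p, q[v] / n_q) for v in p.keys() | q.keys())


def entropy(xs):
    """Shannon entropy H(X) in bits, estimated from sample frequencies."""
    n = len(xs)
    return -sum(c / n * log2(c / n) for c in Counter(xs).values())


def joint_entropy(xs, ys):
    """Joint entropy H(X, Y) in bits over paired samples."""
    return entropy(list(zip(xs, ys)))


def conditional_entropy(ys, xs):
    """H(Y | X) = H(X, Y) - H(X); near 0 means X largely determines Y."""
    return joint_entropy(xs, ys) - entropy(xs)


# Hypothetical toy alerts as (destination port, signature) pairs.
real = [("80", "http_scan"), ("80", "http_scan"),
        ("22", "ssh_brute"), ("22", "ssh_brute")]
generated = [("80", "http_scan"), ("80", "http_scan"),
             ("80", "http_scan"), ("22", "ssh_brute")]

# Intersection over the joint (port, signature) feature combination.
score = histogram_intersection(real, generated)

# Dependency of signature on port in the ground-truth alerts.
ports = [port for port, _ in real]
signatures = [sig for _, sig in real]
h_sig_given_port = conditional_entropy(signatures, ports)

print(score, h_sig_given_port)
```

On this toy data the intersection is 0.75, and the conditional entropy of signature given port is 0 bits, reflecting that each port maps to exactly one signature in the ground-truth sample. Computing the same conditional entropy on the generated alerts and comparing the two values is one simple way to check whether a dependency was preserved.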