Despite extensive research on Machine Learning-based Network Intrusion
Detection Systems (ML-NIDS), their capability to detect diverse attack variants
remains uncertain. Prior studies have largely relied on homogeneous datasets,
which artificially inflate performance scores and offer a false sense of
security. Designing systems that reliably detect a wide range of attack
variants remains a significant challenge. Progress in ML-NIDS still depends
heavily on human expertise, which can embed system designers' subjective
judgments into the model and hinder its ability to generalize across diverse
attack types.
To address this gap, we propose KnowML, a framework for knowledge-guided
machine learning that integrates attack knowledge into ML-NIDS. KnowML
systematically explores the threat landscape by leveraging Large Language
Models (LLMs) to perform automated analysis of attack implementations. It
constructs a unified Knowledge Graph (KG) of attack strategies, on which it
applies symbolic reasoning to generate KG-Augmented Input, embedding domain
knowledge directly into the design process of ML-NIDS.
We evaluate KnowML on 28 realistic attack variants, of which 10 are newly
collected for this study. Our findings reveal that baseline ML-NIDS models fail
to detect several variants entirely, with F1 scores as low as 0%. In
contrast, our knowledge-guided approach achieves F1 scores of up to 99% while
maintaining a False Positive Rate below 0.1%.
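
For readers less familiar with the two reported metrics, the following minimal sketch shows how F1 score and False Positive Rate follow from confusion-matrix counts, and why a detector that misses every instance of an attack variant scores an F1 of 0. The function name `f1_and_fpr` and all counts are hypothetical illustrations, not code or results from this work.

```python
def f1_and_fpr(tp, fp, tn, fn):
    """Return (F1, FPR) from confusion-matrix counts.

    tp/fn: attack flows detected / missed
    fp/tn: benign flows flagged / correctly passed
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    # F1 is the harmonic mean of precision and recall; defined as 0 when both are 0.
    f1 = 2 * precision * recall / denom if denom else 0.0
    # FPR is the fraction of benign traffic that raises an alert.
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return f1, fpr


# Hypothetical counts only (NOT results from the paper).
# A detector that misses every attack flow of a variant: TP = 0, so F1 = 0.
missed = f1_and_fpr(tp=0, fp=5, tn=9995, fn=100)

# A detector that catches nearly all attack flows with very few false alerts.
detected = f1_and_fpr(tp=99, fp=1, tn=9999, fn=1)

print(f"missed:   F1={missed[0]:.2%}, FPR={missed[1]:.2%}")
print(f"detected: F1={detected[0]:.2%}, FPR={detected[1]:.2%}")
```

Because F1 depends only on attack-class counts and FPR only on benign-class counts, the two capture different failure modes, which is why they are reported together.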