Abstract
Despite the growing popularity of modern machine learning techniques (e.g.,
deep neural networks) in cyber-security applications, most of these models are
perceived as black boxes by their users. Adversarial machine learning offers an
approach to increase our understanding of such models. In this paper, we
present an approach to generate explanations for incorrect classifications made
by data-driven Intrusion Detection Systems (IDSs). An adversarial approach is
used to find the minimum modifications (of the input features) required to
correctly classify a given set of misclassified samples. The magnitude of such
modifications is used to visualize the most relevant features that explain the
reason for the misclassification. The presented methodology generated
satisfactory explanations that describe the reasoning behind the
misclassifications, with descriptions that match expert knowledge. The
advantages of the presented methodology are that it: 1) is applicable to any
classifier with defined gradients; 2) does not require any modification of the
classifier model; and 3) can be extended to perform further diagnosis (e.g.,
vulnerability assessment) and to gain further understanding of the system. Experimental
evaluation was conducted on the NSL-KDD99 benchmark dataset using linear and
multilayer perceptron classifiers. The results are presented using intuitive
visualizations to improve their interpretability.
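
The following is a minimal sketch, not the authors' released code, of the adversarial explanation step the abstract describes: for a misclassified sample, gradient descent searches for the smallest input-feature modification that makes the classifier output the correct label. The function name, the regularization weight c_reg, and the optimizer settings are illustrative assumptions.

```python
# Minimal sketch (assumptions: PyTorch classifier `model`, sample `x` of
# shape (1, n_features), correct label `y_true` of shape (1,)).
import torch
import torch.nn.functional as F

def minimal_correcting_modification(model, x, y_true, c_reg=1.0,
                                    lr=0.01, steps=500):
    """Gradient search for the minimum-norm feature modification that
    flips the model's prediction for x to the correct class y_true."""
    delta = torch.zeros_like(x, requires_grad=True)
    optimizer = torch.optim.Adam([delta], lr=lr)
    for _ in range(steps):
        logits = model(x + delta)
        # Cross-entropy pulls the prediction toward the true class;
        # the L2 penalty keeps the modification as small as possible.
        loss = F.cross_entropy(logits, y_true) + c_reg * delta.pow(2).sum()
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    return delta.detach()
```

The per-feature magnitudes of the returned delta can then be visualized: features that require the largest change to correct the prediction are the ones that most explain the original misclassification.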