Advanced cyber threats (e.g., Fileless Malware and Advanced Persistent Threat
(APT)) have driven the adoption of provenance-based security solutions. These
solutions employ Machine Learning (ML) models for behavioral modeling and
critical security tasks such as malware and anomaly detection. However, the
opacity of ML-based security models limits their broader adoption, as the lack
of transparency in their decision-making processes restricts explainability and
verifiability. We tailor our work to Graph Neural Network (GNN)-based security
models, since recent studies employ GNNs to comprehensively digest system
provenance graphs for security-critical tasks.
To enhance the explainability of GNN-based security models, we introduce
PROVEXPLAINER, a framework offering instance-level security-aware explanations
using an interpretable surrogate model. PROVEXPLAINER's interpretable feature
space consists of discriminant subgraph patterns and graph structural features,
which can be directly mapped to the system provenance problem space, making the
explanations human-interpretable. We show how PROVEXPLAINER synergizes with
current state-of-the-art (SOTA) GNN explainers to deliver domain and
instance-specific explanations. We measure explanation quality along three
axes: the Fidelity+/Fidelity- metrics used in the traditional GNN explanation
literature; precision/recall, which scores the explanation against the ground
truth; and a human actionability metric we designed based on graph traversal
distance. On real-world Fileless and APT
datasets, PROVEXPLAINER achieves up to 29% higher Fidelity+, 27% higher
precision, 25% higher recall, and 1.4x higher actionability (higher is
better), and up to 12% lower Fidelity- (lower is better) than SOTA GNN
explainers.